Cluster Analysis for Corpus Linguistics

Author: Hermann Moisl
Publisher: Walter de Gruyter GmbH & Co KG
Total Pages: 319
Release: 2015-02-24
Genre: Language Arts & Disciplines
ISBN: 3110393174

The standard scientific methodology in linguistics is empirical testing of falsifiable hypotheses. As such, the process of hypothesis generation is central, and involves formulation of a research question about a domain of interest and statement of a hypothesis relative to it. In corpus linguistics the domain is text, and generation involves abstraction of data from text, data analysis, and formulation of a hypothesis based on inference from the results. Traditionally this process has been paper-based, but the advent of electronic text has increasingly rendered it obsolete, both because the size of digital corpora is now at or beyond the limit of what can efficiently be used in the traditional way, and because the complexity of data abstracted from them can be impenetrable to understanding. Linguists are increasingly turning to mathematical and statistical computational methods for help, and cluster analysis is such a method. It is used across the sciences for hypothesis generation by identification of structure in data which are too large or complex, or both, to be interpretable by direct inspection. This book aims to show how cluster analysis can be used for hypothesis generation in corpus linguistics, thereby contributing to a quantitative empirical methodology for the discipline.

Cluster Analysis for Corpus Linguistics

Author: Hermann Moisl
Publisher: Walter de Gruyter
Total Pages: 381
Release: 2015-01-16
Genre:
ISBN: 9783110363821

The rapidly growing volume of digital natural language text and the complexity of data abstracted from it have increasingly rendered traditional corpus linguistic analytical methodology obsolete. This book describes a cluster analytic methodology for generating linguistic hypotheses on the basis of data abstracted from language corpora.

Corpus Linguistics and Statistics with R

Author: Guillaume Desagulier
Publisher: Springer
Total Pages: 359
Release: 2017-11-17
Genre: Computers
ISBN: 3319645722

This textbook examines empirical linguistics from a theoretical linguist’s perspective. It provides both a theoretical discussion of what quantitative corpus linguistics entails and detailed, hands-on, step-by-step instructions to implement the techniques in the field. The statistical methodology and R-based coding from this book teach readers the basic and then more advanced skills to work with large data sets in their linguistics research and studies. Massive data sets are now more than ever the basis for work that ranges from usage-based linguistics to the far reaches of applied linguistics. This book presents much of the methodology in a corpus-based approach. However, the corpus-based methods in this book are also essential components of recent developments in sociolinguistics, historical linguistics, computational linguistics, and psycholinguistics. Material from the book will also be appealing to researchers in digital humanities and the many non-linguistic fields that use textual data analysis and text-based sensorimetrics. Chapters cover topics including corpus processing, frequency data, and clustering methods. Case studies illustrate each chapter with accompanying data sets, R code, and exercises for use by readers. This book may be used in advanced undergraduate courses, graduate courses, and self-study.

Statistics in Corpus Linguistics

Author: Vaclav Brezina
Publisher: Cambridge University Press
Total Pages: 317
Release: 2018-09-20
Genre: Foreign Language Study
ISBN: 1107125707

A comprehensive and accessible introduction to statistics in corpus linguistics, covering multiple techniques of quantitative language analysis and data visualisation.

Corpus Linguistics

Author: Tony McEnery
Publisher: Cambridge University Press
Total Pages: 311
Release: 2011-10-06
Genre: Language Arts & Disciplines
ISBN: 1139502441

Corpus linguistics is the study of language data on a large scale: the computer-aided analysis of very extensive collections of transcribed utterances or written texts. This textbook outlines the basic methods of corpus linguistics, explains how the discipline of corpus linguistics developed, and surveys the major approaches to the use of corpus data. It uses a broad range of examples to show how corpus data has led to methodological and theoretical innovation in linguistics in general. Clear and detailed explanations lay out the key issues of method and theory in contemporary corpus linguistics. A structured and coherent narrative links the historical development of the field to current topics in 'mainstream' linguistics. Practical tasks and questions for discussion at the end of each chapter encourage students to test their understanding of what they have read, and an extensive glossary provides easy access to definitions of technical terms used in the text.

Statistics in Corpus Linguistics

Author: Vaclav Brezina
Publisher: Cambridge University Press
Total Pages: 317
Release: 2018-09-20
Genre: Language Arts & Disciplines
ISBN: 1108638627

Do you use language corpora in your research or study, but find that you struggle with statistics? This practical introduction will equip you to understand the key principles of statistical thinking and apply these concepts to your own research, without the need for prior statistical knowledge. The book gives step-by-step guidance through the process of statistical analysis and provides multiple examples of how statistical techniques can be used to analyse and visualise linguistic data. It also includes a useful selection of discussion questions and exercises which you can use to check your understanding. The book comes with a Companion website, which provides additional materials (answers to exercises, datasets, advanced materials, teaching slides etc.) and Lancaster Stats Tools online (http://corpora.lancs.ac.uk/stats), a free click-and-analyse statistical tool for easy calculation of the statistical measures discussed in the book.

Glossary of Corpus Linguistics

Author: Paul Baker
Publisher: Edinburgh University Press
Total Pages: 192
Release: 2006-05-19
Genre: Language Arts & Disciplines
ISBN: 0748626905

This alphabetic guide provides definitions and discussion of key terms used in corpus linguistics. Corpus data is being used in a growing number of English and Linguistics departments which have no record of past research with corpus data. This is the first comprehensive glossary of the many specialist terms in corpus linguistics and will be useful for corpus linguists and non-corpus linguists alike. Clearly written by a team of experienced academics in the field, the glossary provides full coverage of both traditional and contemporary terminology.

Statistics for Corpus Linguistics

Author: Michael Oakes
Publisher: Edinburgh University Press
Total Pages: 304
Release: 2019-08-06
Genre: Language Arts & Disciplines
ISBN: 1474471382

This book in the Edinburgh Textbooks in Empirical Linguistics series is a comprehensive introduction to the statistics currently used in corpus linguistics. Statistical techniques and corpus applications - whether oriented towards linguistics or language engineering - often go hand in glove, and corpus linguists have used an increasingly wide variety of statistics, drawing on techniques developed in a great many fields. This is the first one-volume introduction to the subject.

The Routledge Handbook of Corpus Linguistics

Author: Anne O'Keeffe
Publisher: Routledge
Total Pages: 684
Release: 2022-02-08
Genre: Language Arts & Disciplines
ISBN: 0429632649

The second edition of The Routledge Handbook of Corpus Linguistics provides an updated overview of a dynamic and rapidly growing area with a widely applied methodology. Over a decade on from the first edition of the Handbook, this collection of 47 chapters from experts in key areas offers a comprehensive introduction to both the development and use of corpora as well as their ever-evolving applications to other areas, such as digital humanities, sociolinguistics, stylistics, translation studies, materials design, language teaching and teacher development, media discourse, discourse analysis, forensic linguistics, second language acquisition and testing. The new edition updates all core chapters and includes new chapters on corpus linguistics and statistics, digital humanities, translation, phonetics and phonology, second language acquisition, social media and theoretical perspectives. Chapters provide annotated further reading lists and step-by-step guides as well as detailed overviews across a wide range of themes. The Handbook also includes a wealth of case studies that draw on some of the many new corpora and corpus tools that have emerged in the last decade. Organised across four themes, moving from the basic start-up topics such as corpus building and design to analysis, application and reflection, this second edition remains a crucial point of reference for advanced undergraduates, postgraduates and scholars in applied linguistics.