Programming for Corpus Linguistics
Author: Oliver Mason
Publisher: Edinburgh Textbooks in Empiric
Release: 2000
Genre: Computers
ISBN: 9780748614073

Specialised linguistic research needs can no longer be met by available software. This book enables the researcher to write programs for text and corpus processing, using the popular and easy-to-learn Java language.

Quantitative Corpus Linguistics with R
Author: Stefan Th. Gries
Publisher: Routledge
Total Pages: 257
Release: 2009-03-04
Genre: Education
ISBN: 1135895600

The first textbook of its kind, Quantitative Corpus Linguistics with R demonstrates how to use the open source programming language R for corpus linguistic analyses. Computational and corpus linguists doing corpus work will find that R provides an enormous range of functions that currently require several programs to achieve – searching and processing corpora, arranging and outputting the results of corpus searches, statistical evaluation, and graphing.

Corpus Linguistics
Author: Douglas Biber
Publisher: Cambridge University Press
Total Pages: 324
Release: 1998-04-23
Genre: Computers
ISBN: 9780521499576

An investigation into the way people use language in speech and writing, this volume introduces the corpus-based approach, which is based on analysis of large databases of real language examples stored on computer.

A Practical Handbook of Corpus Linguistics
Author: Magali Paquot
Publisher: Springer Nature
Total Pages: 686
Release: 2021-05-04
Genre: Philosophy
ISBN: 3030462161

This handbook is a comprehensive practical resource on corpus linguistics. It features a range of basic and advanced approaches, methods and techniques in corpus linguistics, from corpus compilation principles to quantitative data analyses. The Handbook is organized in six Parts. Parts I to III feature chapters that discuss key issues and the know-how related to various topics around corpus design, methods and corpus types. Parts IV-V aim to offer a user-friendly introduction to the quantitative analysis of corpus data: for each statistical technique discussed, chapters provide a practical guide with R and come with supplementary online material. Part VI focuses on how to write a corpus linguistic paper and how to meta-analyze corpus linguistic research. The volume can serve as a course book as well as for individual study. It will be essential reading for students of corpus linguistics as well as experienced researchers who want to expand their knowledge of the field.

Practical Corpus Linguistics
Author: Martin Weisser
Publisher: John Wiley & Sons
Total Pages: 306
Release: 2016-02-16
Genre: Language Arts & Disciplines
ISBN: 1118831888

This is the first book of its kind to provide a practical and student-friendly guide to corpus linguistics that explains the nature of electronic data and how it can be collected and analyzed. It is designed to equip readers with the technical skills necessary to analyze and interpret language data, both written and (orthographically) transcribed. It introduces a number of easy-to-use, yet powerful, free analysis resources consisting of standalone programs and web interfaces for use with Windows, Mac OS X, and Linux. Each section includes practical exercises, a list of sources and further reading, and illustrated step-by-step introductions to analysis tools. The book requires only a basic knowledge of computer concepts in order to develop the specific linguistic analysis skills required for understanding and analyzing corpus data.

Natural Language Processing for Corpus Linguistics
Author: Jonathan Dunn
Publisher: Cambridge University Press
Total Pages: 149
Release: 2022-03-31
Genre: Language Arts & Disciplines
ISBN: 1009083740

Corpus analysis can be expanded and scaled up by incorporating computational methods from natural language processing. This Element shows how text classification and text similarity models can extend our ability to undertake corpus linguistics across very large corpora. These computational methods are becoming increasingly important as corpora grow too large for more traditional types of linguistic analysis. We draw on five case studies to show how and why to use computational methods, ranging from usage-based grammar to authorship analysis to using social media for corpus-based sociolinguistics. Each section is accompanied by an interactive code notebook that shows how to implement the analysis in Python. A stand-alone Python package is also available to help readers use these methods with their own data. Because large-scale analysis introduces new ethical problems, this Element pairs each new methodology with a discussion of potential ethical implications.

Corpus Linguistics and Statistics with R
Author: Guillaume Desagulier
Publisher: Springer
Total Pages: 359
Release: 2017-11-17
Genre: Computers
ISBN: 3319645722

This textbook examines empirical linguistics from a theoretical linguist’s perspective. It provides both a theoretical discussion of what quantitative corpus linguistics entails and detailed, hands-on, step-by-step instructions to implement the techniques in the field. The statistical methodology and R-based coding from this book teach readers the basic and then more advanced skills to work with large data sets in their linguistics research and studies. Massive data sets are now more than ever the basis for work that ranges from usage-based linguistics to the far reaches of applied linguistics. This book presents much of the methodology in a corpus-based approach. However, the corpus-based methods in this book are also essential components of recent developments in sociolinguistics, historical linguistics, computational linguistics, and psycholinguistics. Material from the book will also be appealing to researchers in digital humanities and the many non-linguistic fields that use textual data analysis and text-based sensorimetrics. Chapters cover topics including corpus processing, frequencing data, and clustering methods. Case studies illustrate each chapter with accompanying data sets, R code, and exercises for use by readers. This book may be used in advanced undergraduate courses, graduate courses, and self-study.

Developing Linguistic Corpora
Author: Martin Wynne
Publisher: Oxbow Books Limited
Total Pages: 100
Release: 2005
Genre: Language Arts & Disciplines

A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.

An Introduction to Corpus Linguistics
Author: Graeme Kennedy
Publisher: Routledge
Total Pages: 328
Release: 2014-09-19
Genre: Language Arts & Disciplines
ISBN: 1317892585

The use of large, computerized bodies of text for linguistic analysis and description has emerged in recent years as one of the most significant and rapidly developing fields of activity in the study of language. This book provides a comprehensive introduction and guide to Corpus Linguistics. All aspects of the field are explored, from the various types of electronic corpora that are available to instructions on how to design and compile a corpus. Graeme Kennedy surveys the development of corpora for use in linguistic research, looking back to the pre-electronic age as well as to the massive growth of computer corpora in the electronic age.

Essential Python for Corpus Linguistics
Author: Mark Johnson
Publisher: Wiley-Blackwell
Total Pages: 208
Release: 2008
Genre: Computers
ISBN: 9781405145640

Linguistic research increasingly relies on large electronic corpora for its primary data. While off-the-shelf programs can perform a set of standard searches, specialized questions usually require a custom-written program to find their answers. Essential Python for Corpus Linguistics uses the programming language Python to explain how to write simple programs that extract linguistically useful information, such as the frequency of a given utterance in a particular context within a corpus, or instances of certain phrasal structures in a Treebank. Assuming no prior programming background, the book provides numerous example programs that search for phonological, morphological and syntactic constructions in corpora, and the associated web site provides sample data and programs, which make it easy to start working independently. This book is a valuable resource for linguists who use corpus methods but have no programming training.