Applying Language Technology in Humanities Research

Author: Barbara McGillivray
Publisher: Springer Nature
Total Pages: 133
Release: 2020-07-13
Genre: Language Arts & Disciplines
ISBN: 3030464938

This book presents established and state-of-the-art methods in Language Technology (including text mining, corpus linguistics, computational linguistics, and natural language processing), and demonstrates how they can be applied by humanities scholars working with textual data. The landscape of humanities research has recently changed thanks to the proliferation of big data and large textual collections such as Google Books, Early English Books Online, and Project Gutenberg. These resources have yet to be fully explored by new generations of scholars, and the authors argue that Language Technology has a key role to play in the exploration of large-scale textual data. The authors use a series of illustrative examples from various humanistic disciplines (mainly but not exclusively from History, Classics, and Literary Studies) to demonstrate basic and more complex use-case scenarios. This book will be useful to graduate students and researchers in humanistic disciplines working with textual data, including History, Modern Languages, Literary Studies, Classics, and Linguistics. It is also a very useful book for anyone teaching or learning Digital Humanities and interested in the basic concepts of computational linguistics, corpus linguistics, and natural language processing.

The Routledge Handbook of Corpus Linguistics

Author: Anne O'Keeffe
Publisher: Routledge
Total Pages: 1263
Release: 2010-04-05
Genre: Education
ISBN: 1135153620

The Routledge Handbook of Corpus Linguistics provides a timely overview of a dynamic and rapidly growing area with a widely applied methodology. Through the electronic analysis of large bodies of text, corpus linguistics demonstrates and supports linguistic statements and assumptions. In recent years it has seen an ever-widening application in a variety of fields: computational linguistics, discourse analysis, forensic linguistics, pragmatics and translation studies. Bringing together experts in the key areas of development and change, the handbook is structured around six themes which take the reader from building and designing a corpus to using a corpus to study literature and translation. A comprehensive introduction covers the historical development of the field and its growing influence and application in other areas. Each contribution is structured around five headings for ease of reference and includes a further reading section with three to five key texts highlighted and annotated to facilitate further exploration of the topics. The Routledge Handbook of Corpus Linguistics is the ideal resource for advanced undergraduates and postgraduates.

Quantitative Corpus Linguistics with R

Author: Stefan Th. Gries
Publisher: Routledge
Total Pages: 257
Release: 2009-03-04
Genre: Education
ISBN: 1135895600

The first textbook of its kind, Quantitative Corpus Linguistics with R demonstrates how to use the open source programming language R for corpus linguistic analyses. Computational and corpus linguists doing corpus work will find that R provides an enormous range of functions that currently require several programs to achieve – searching and processing corpora, arranging and outputting the results of corpus searches, statistical evaluation, and graphing.

Language Corpora Annotation and Processing

Author: Niladri Sekhar Dash
Publisher: Springer Nature
Total Pages:
Release: 2021
Genre: Computational linguistics
ISBN: 9811629609

This book addresses the research, analysis, and description of the methods and processes that are used in the annotation and processing of language corpora in advanced, semi-advanced, and non-advanced languages. It provides the background information and empirical data needed to understand the nature and depth of problems related to corpus annotation and text processing and shows readers how the linguistic elements found in texts are analyzed and applied to develop language technology systems and devices. As such, it offers valuable insights for researchers, educators, and students of linguistics and language technology.

Developing Linguistic Corpora

Author: Martin Wynne
Publisher: Oxbow Books Limited
Total Pages: 100
Release: 2005
Genre: Language Arts & Disciplines
ISBN:

A linguistic corpus is a collection of texts which have been selected and brought together so that language can be studied on the computer. Today, corpus linguistics offers some of the most powerful new procedures for the analysis of language, and the impact of this dynamic and expanding sub-discipline is making itself felt in many areas of language study. In this volume, a selection of leading experts in various key areas of corpus construction offer advice in a readable and largely non-technical style to help the reader to ensure that their corpus is well designed and fit for the intended purpose. This guide is aimed at those who are at some stage of building a linguistic corpus. Little or no knowledge of corpus linguistics or computational procedures is assumed, although it is hoped that more advanced users will find the guidelines here useful. It is also aimed at those who are not building a corpus, but who need to know something about the issues involved in the design of corpora in order to choose between available resources and to help draw conclusions from their studies.

History, Features, and Typology of Language Corpora

Author: Niladri Sekhar Dash
Publisher: Springer
Total Pages: 311
Release: 2018-02-01
Genre: Language Arts & Disciplines
ISBN: 9811074585

This book discusses key issues of corpus linguistics like the definition of the corpus, primary features of a corpus, and utilization and limitations of corpora. It presents a unique classification scheme of language corpora to show how they can be studied from the perspective of genre, nature, text type, purpose, and application. A reference to parallel translation corpus is mandatory in the discussion of corpus generation, which the authors thoroughly address here, with a focus on Indian language corpora and English. Web-text corpus, a new development in corpus linguistics, is also discussed with elaborate reference to Indian web text corpora. The book also presents a short history of corpus generation and provides scenarios before and after the advent of computer-generated digital corpora. This book has several important features: it discusses many technical issues of the field in a lucid manner; contains extensive new diagrams and charts for easy comprehension; and presents discussions in simplified English to cater to the needs of non-native English readers. This is an important resource authored by academics who have many years of experience teaching and researching corpus linguistics. Its focus on Indian languages and on English corpora makes it applicable to students of graduate and postgraduate courses in applied linguistics, computational linguistics and language processing in South Asia and across countries where English is spoken as a first or second language.

Corpus Linguistics: An Introduction

Author: Niladri Sekhar Dash
Publisher: Pearson Education India
Total Pages: 208
Release: 2008
Genre:
ISBN: 8131752623

Corpus Linguistics: An Introduction will appeal to a wide spectrum of scholars, researchers, and particularly to students of linguistics. It offers guidelines for the creation and usage of corpora in the form of empirical language databases with direct functional and theoretical interpretation of a natural language. Drawn from original research and written in an accessible language and style, this book will create avenues for further advancements in mainstream and applied linguistics and language technology.

Corpus Linguistics for Education

Author: Pascual Pérez-Paredes
Publisher: Routledge
Total Pages: 179
Release: 2020-07-30
Genre: Education
ISBN: 0429516762

Corpus Linguistics for Education provides a practical and comprehensive introduction to the use of corpus research methods in the field of education. Taking a hands-on approach to showcase the applications of corpora in the exploration of educationally relevant topics, this book:
• covers 18 key skills including corpus building, the role of frequency, different corpus methods, transcription and annotation;
• demonstrates the use of available corpora and desktop and online corpus analysis tools to conduct original analyses;
• features case studies and step-by-step guides within each chapter;
• emphasises the use of interview data in research projects.
Corpus Linguistics for Education is an essential guide for students and researchers studying or conducting their own corpus-based research in education.

Corpus Linguistics in Literary Analysis

Author: Bettina Fischer-Starcke
Publisher: Bloomsbury Publishing
Total Pages: 241
Release: 2010-08-26
Genre: Language Arts & Disciplines
ISBN: 1441158839

Corpus Linguistics in Literary Analysis provides a theoretical introduction to corpus stylistics and demonstrates its application by presenting corpus stylistic analyses of literary texts and corpora. The first part of the book addresses theoretical issues such as the relationship between subjectivity and objectivity in corpus linguistic analyses and criteria for the evaluation of results from corpus linguistic analyses, and also discusses units of meaning in language. The second part of the book takes this theory and applies it to Northanger Abbey by Jane Austen and to two corpora: Austen's six novels, and texts that are contemporary with Austen. The analyses demonstrate the impact of various features of text on literary meanings and how corpus tools can extract new critical angles. This book will be a key read for upper-level undergraduates and postgraduates working in corpus linguistics and in stylistics on linguistics and language studies courses. The editorial board includes: Paul Baker (Lancaster), Frantisek Cermak (Prague), Susan Conrad (Portland), Geoffrey Leech (Lancaster), Dominique Maingueneau (Paris XII), Christian Mair (Freiburg), Alan Partington (Bologna), Elena Tognini-Bonelli (Siena and TWC), Ruth Wodak (Lancaster), and Feng Zhiwei (Beijing). The Corpus and Discourse series consists of two strands. The first, Research in Corpus and Discourse, features innovative contributions to various aspects of corpus linguistics and a wide range of applications, from language technology via the teaching of a second language to a history of mentalities. The second strand, Studies in Corpus and Discourse, comprises key texts bridging the gap between social studies and linguistics. Although equally academically rigorous, this strand is aimed at a wider audience of academics and postgraduate students working in both disciplines.