Linguistic Corpora and Big Data in Spanish and Portuguese
Author | : Miguel Calderón Campos, Gael Vaamonde |
Publisher | : Walter de Gruyter GmbH & Co KG |
Total Pages | : 260 |
Release | : 2024-06-29 |
Genre | : |
ISBN | : 3110781522 |
Author | : Miguel Calderón Campos |
Publisher | : Walter de Gruyter GmbH & Co KG |
Total Pages | : 238 |
Release | : 2024-10-21 |
Genre | : Language Arts & Disciplines |
ISBN | : 3110781468 |
In recent decades, corpus linguistics has experienced tremendous development in the Hispanic world, along two opposite but complementary approaches: increase in corpus size (corpus linguistics as Big Data) and improvement in document selection and data annotation (corpus linguistics as High Quality Data). The first approach has led to the creation of massive corpora such as EsTenTen; at the same time, it has promoted the use of the web and social networks as corpora. The second perspective gives rise to specialized corpora such as Post Scriptum or Oralia Diacrónica del español (ODE). The contributions gathered in this volume combine both methods in order to exploit their advantages and to overcome their possible limitations. On the one hand, it addresses the creation and design of small corpora focused on data quality; on the other hand, it offers case studies that make use of both specialized corpora and massive data extracted from the web. Highlighting the complementary nature of both methods is the main idea of this book.
Author | : Miguel Calderón Campos |
Publisher | : |
Total Pages | : 0 |
Release | : 2024 |
Genre | : |
ISBN | : 9783110781458 |
Author | : Allison Burkette |
Publisher | : |
Total Pages | : 253 |
Release | : 2018-03-15 |
Genre | : Language Arts & Disciplines |
ISBN | : 1108424805 |
Introduces students to the scientific study of language, using the basic principles of complexity theory.
Author | : Juan Antonio Lossio-Ventura |
Publisher | : Springer |
Total Pages | : 400 |
Release | : 2019-02-07 |
Genre | : Computers |
ISBN | : 3030116808 |
This book constitutes the refereed proceedings of the 5th International Conference on Information Management and Big Data, SIMBig 2018, held in Lima, Peru, in September 2018. The 34 papers presented were carefully reviewed and selected from 101 submissions. The papers address issues such as data mining, artificial intelligence, natural language processing, information retrieval, machine learning, and web mining.
Author | : J. Dinesh Peter |
Publisher | : Springer |
Total Pages | : 575 |
Release | : 2018-12-12 |
Genre | : Technology & Engineering |
ISBN | : 9811318824 |
This book is a compendium of the proceedings of the International Conference on Big Data and Cloud Computing. It includes recent advances in the areas of big data analytics, cloud computing, internet of nano things, cloud security, data analytics in the cloud, smart cities and grids, etc. This volume focuses primarily on applying knowledge to solve societal problems through cutting-edge technologies. The articles featured in these proceedings provide novel ideas that contribute to the growth of world-class research and development. The contents of this volume will be of interest to researchers and professionals alike.
Author | : A. Joaquim da Silva Teixeira |
Publisher | : Springer |
Total Pages | : 290 |
Release | : 2008-09-08 |
Genre | : Language Arts & Disciplines |
ISBN | : 3540859802 |
This book constitutes the thoroughly refereed proceedings of the 8th International Workshop on Computational Processing of the Portuguese Language, PROPOR 2008, held in Aveiro, Portugal, in September 2008. The 21 revised full papers and 16 revised short papers presented were carefully reviewed and selected from 63 submissions. The papers are organized in topical sections on speech analysis; ontologies, semantics and anaphora resolution; speech synthesis; machine learning applied to natural language processing; speech recognition and applications; natural language processing tools and applications; posters.
Author | : Ana Gallego Cuiñas, Daniel Torres-Salinas |
Publisher | : Walter de Gruyter GmbH & Co KG |
Total Pages | : 218 |
Release | : 2023-10-12 |
Genre | : |
ISBN | : 3110753618 |
Author | : Anke Lüdeling |
Publisher | : Walter de Gruyter |
Total Pages | : 797 |
Release | : 2008-12-10 |
Genre | : Language Arts & Disciplines |
ISBN | : 3110211424 |
This volume provides an up-to-date survey of the field of corpus linguistics, a field whose methodology has revolutionized much of the empirical work done in most fields of linguistic study over the past decade. Corpus linguistics investigates human language by starting out from large collections of texts - spoken, written, or recorded. These language corpora, which are now regularly available in electronic form, are the basis for quantitative and qualitative research on almost any question of linguistic interest. Many techniques that are in use in corpus linguistics today are rooted in the tradition of the late 18th and 19th century, when linguistics began to make use of mathematical and empirical methods. Modern corpus linguistics has used and developed these methods in close connection with computer science and computational linguistics. The handbook sketches the history of corpus linguistics, shows its potential, discusses its problems, and describes various methods of collecting, annotating, and searching corpora as well as processing corpus data. It also reports case studies that illustrate the wide range of linguistic research questions addressed in corpus linguistics. The over 60 articles included in the handbook are divided into five sections: (1) the origins and history of corpus linguistics and surveys of its relationship to central fields of linguistics (2) corpus compilation (3) corpus types (4) preprocessing of corpora (5) the use and exploitation of corpora. The final section gives an overview of the results of corpus studies obtained in phonetics, phonology, morphology, syntax, semantics, sociolinguistics, historical linguistics, stylometry, dialectology, and discourse analysis. It also reports on recent advances made in human and machine translation, contrastive studies, computer-assisted language learning, and automatic summarization. The contributors to the volume are internationally known experts in their respective fields. 
The handbook is intended for a wide audience ranging from teachers, university students, and scholars to anyone interested in the use of computers in linguistic analyses and applications.
Author | : Christoph Gabriel |
Publisher | : Walter de Gruyter GmbH & Co KG |
Total Pages | : 735 |
Release | : 2021-11-22 |
Genre | : Language Arts & Disciplines |
ISBN | : 3110548674 |
This handbook is structured in two parts: it provides, on the one hand, a comprehensive (synchronic) overview of the phonetics and phonology (including prosody) of a breadth of Romance languages and focuses, on the other hand, on central topics of research in Romance segmental and suprasegmental phonology, including comparative and diachronic perspectives. Phonetics and phonology have always been a core discipline in Romance linguistics: the wide synchronic variety of languages and dialects derived from spoken Latin is extensively explored in numerous corpus and atlas projects, and for quite a few of these varieties there is also more or less ample documentation of at least some of their diachronic stages. This rich empirical database offers excellent testing grounds for different theoretical approaches and allows for substantial insights into phonological structuring as well as into (incipient, ongoing, or concluded) processes of phonological change. The volume can be read both as a state-of-the-art report of research in the field and as a manual of Romance languages with special emphasis on the key topics of phonetics and phonology.