Attack of the Tagger

Author: Wendelin Van Draanen
Publisher: Alfred A. Knopf
Total Pages: 175
Release: 2004
Genre: Juvenile Fiction
ISBN: 0375823522

Someone is spray-painting graffiti all over Cedar Valley and it is up to fifth-grader Nolan Byrd, also known as Shredderman, to expose the vandal.

Crosslingual Implementation of Linguistic Taggers Using Parallel Corpora

Author: Hani Safadi
Publisher: Lulu.com
Total Pages: 74
Release: 2010-04-27
Genre: Computers
ISBN: 0557448093

This book addresses the problem of creating linguistic taggers for resource-poor languages using existing taggers in resource-rich languages. Linguistic taggers are classifiers that map individual words or phrases from a sentence to a set of tags. They are usually trained using supervised learning algorithms. The proposed approach does not require that the input sentence be translated into the source language. Instead, projection of linguistic tags is accomplished through the use of a parallel corpus, which is a collection of texts that are available in a source language and a target language. The correspondence between words of the source and target languages allows tags to be projected from source-language words to target-language words. A parallel corpus of the source and target languages might not be readily available for many language pairs. To deal with this problem, we describe a system for automatic acquisition of aligned, bilingual corpora from pre-specified domains on the World Wide Web.
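
As a rough illustration of the tag projection idea described above, here is a minimal Python sketch. It is not taken from the book: the function name `project_tags`, the tag labels, the example sentences, and the hand-written word alignment are all assumptions made for illustration. In practice the alignment would come from an automatic word aligner run over a parallel corpus.

```python
from typing import List, Optional, Tuple


def project_tags(
    source_tags: List[str],
    alignment: List[Tuple[int, int]],
    target_length: int,
) -> List[Optional[str]]:
    """Copy each source word's tag onto the target word aligned with it.

    `alignment` holds (source_index, target_index) pairs. Target words
    with no aligned source word keep the tag None.
    """
    projected: List[Optional[str]] = [None] * target_length
    for src_idx, tgt_idx in alignment:
        # Keep the first tag that reaches a target word; later links
        # to the same target word do not overwrite it.
        if projected[tgt_idx] is None:
            projected[tgt_idx] = source_tags[src_idx]
    return projected


# Hypothetical sentence pair with a hand-written alignment (assumption).
source_words = ["the", "red", "house"]
source_tags = ["DET", "ADJ", "NOUN"]
target_words = ["la", "casa", "roja"]
alignment = [(0, 0), (1, 2), (2, 1)]  # the->la, red->roja, house->casa

target_tags = project_tags(source_tags, alignment, len(target_words))
print(list(zip(target_words, target_tags)))
```

The projected tags could then serve as noisy training data for a target-language tagger. Unaligned words and one-to-many links are handled here in the simplest possible way and would need more care in a real system.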

Introduction to Linguistic Annotation and Text Analytics

Author: Graham Wilcock
Publisher: Springer Nature
Total Pages: 151
Release: 2022-05-31
Genre: Computers
ISBN: 3031021320

Linguistic annotation and text analytics are active areas of research and development, with academic conferences and industry events such as the Linguistic Annotation Workshops and the annual Text Analytics Summits. This book provides a basic introduction to both fields, and aims to show that good linguistic annotations are the essential foundation for good text analytics. After briefly reviewing the basics of XML, with practical exercises illustrating in-line and stand-off annotations, a chapter is devoted to explaining the different levels of linguistic annotations. The reader is encouraged to create example annotations using the WordFreak linguistic annotation tool. The next chapter shows how annotations can be created automatically using statistical NLP tools, and compares two sets of tools, the OpenNLP and Stanford NLP tools. The second half of the book describes different annotation formats and gives practical examples of how to interchange annotations between different formats using XSLT transformations. The two main text analytics architectures, GATE and UIMA, are then described and compared, with practical exercises showing how to configure and customize them. The final chapter is an introduction to text analytics, describing the main applications and functions including named entity recognition, coreference resolution and information extraction, with practical examples using both open source and commercial tools. Copies of the example files, scripts, and stylesheets used in the book are available from the book's companion website. Table of Contents: Working with XML / Linguistic Annotation / Using Statistical NLP Tools / Annotation Interchange / Annotation Architectures / Text Analytics

Text Analytics for Corpus Linguistics and Digital Humanities

Author: Gerold Schneider
Publisher: Bloomsbury Publishing
Total Pages: 241
Release: 2024-05-02
Genre: Language Arts & Disciplines
ISBN: 1350370843

Do you want to gain a deeper understanding of how big tech analyses and exploits our text data, or investigate how political parties differ by analysing textual styles, associations and trends in documents? Or create a map of a text collection and write a simple QA system yourself? This book explores how to apply state-of-the-art text analytics methods to detect and visualise phenomena in text data. Solidly based on methods from corpus linguistics, natural language processing, text analytics and digital humanities, this book shows readers how to conduct experiments with their own corpora and research questions, underpin their theories, quantify the differences and pinpoint characteristics. Case studies and experiments are detailed in every chapter using real-world and open access corpora from politics, World English, history, and literature. The results are interpreted and put into perspective, pitfalls are pointed out, and necessary pre-processing steps are demonstrated. This book also demonstrates how to use the programming language R, as well as simple alternatives and additions to R, to conduct experiments and employ visualisations by example, with extensible R-code, recipes, links to corpora, and a wide range of methods. The methods introduced can be used across texts of all disciplines, from history or literature to party manifestos and patient reports.

Adaptive and Natural Computing Algorithms

Author: Marco Tomassini
Publisher: Springer
Total Pages: 518
Release: 2013-04-12
Genre: Computers
ISBN: 3642372139

The book constitutes the refereed proceedings of the 11th International Conference on Adaptive and Natural Computing Algorithms, ICANNGA 2013, held in Lausanne, Switzerland, in April 2013. The 51 revised full papers presented were carefully reviewed and selected from a total of 91 submissions. The papers are organized in topical sections on neural networks, evolutionary computation, soft computing, bioinformatics and computational biology, advanced computing, and applications.

Crowdsourcing our Cultural Heritage

Author: Ms Mia Ridge
Publisher: Ashgate Publishing, Ltd.
Total Pages: 313
Release: 2014-10-28
Genre: Language Arts & Disciplines
ISBN: 147241022X

Crowdsourcing, or asking the general public to help contribute to shared goals, is increasingly popular in memory institutions as a tool for digitising or computing vast amounts of data. This book brings together for the first time the collected wisdom of international leaders in the theory and practice of crowdsourcing in cultural heritage. It features eight accessible case studies of groundbreaking projects from leading cultural heritage and academic institutions, and four thought-provoking essays that reflect on the wider implications of this engagement for participants and for the institutions themselves. This book will be essential reading for information and cultural management professionals, students and researchers in universities, corporate, public or academic libraries, museums and archives.

Python 3 Text Processing with NLTK 3 Cookbook

Author: Jacob Perkins
Publisher: Packt Publishing Ltd
Total Pages: 530
Release: 2014-08-26
Genre: Computers
ISBN: 1782167862

This book is intended for Python programmers interested in learning how to do natural language processing. Maybe you’ve learned the limits of regular expressions the hard way, or you’ve realized that human language cannot be deterministically parsed like a computer language. Perhaps you have more text than you know what to do with, and need automated ways to analyze and structure that text. This Cookbook will show you how to train and use statistical language models to process text in ways that are practically impossible with standard programming tools. A basic knowledge of Python and of basic text processing concepts is expected. Some experience with regular expressions will also be helpful.

Information Retrieval

Author: Pavel Braslavski
Publisher: Springer
Total Pages: 370
Release: 2015-12-09
Genre: Computers
ISBN: 3319254855

This book constitutes the thoroughly refereed proceedings of the 8th Russian Summer School on Information Retrieval, RuSSIR 2014, held in Nizhniy Novgorod, Russia, in August 2014. The volume includes 6 tutorial papers, summarizing lectures given at the event, and 8 revised papers from the school participants. The papers focus on various aspects of information retrieval.