Efficient Parsing for Natural Language

Author: Masaru Tomita
Publisher: Springer Science & Business Media
Total Pages: 209
Release: 2013-04-17
Genre: Computers
ISBN: 1475718853

Parsing efficiency is crucial when building practical natural language systems. This is especially the case for interactive systems such as natural language database access, interfaces to expert systems, and interactive machine translation. Despite its importance, parsing efficiency has received little attention in the area of natural language processing. In the areas of compiler design and theoretical computer science, on the other hand, parsing algorithms have been evaluated primarily in terms of theoretical worst-case analysis (e.g., O(n³)), and very few practical comparisons have been made. This book introduces a context-free parsing algorithm that parses natural language more efficiently than any other existing parsing algorithm in practice. Its feasibility for use in practical systems is being proven in its application to a Japanese language interface at Carnegie Group Inc., and to the continuous speech recognition project at Carnegie-Mellon University. This work was done while I was pursuing a Ph.D. degree at Carnegie-Mellon University. My advisers, Herb Simon and Jaime Carbonell, deserve many thanks for their unfailing support, advice and encouragement during my graduate studies. I would like to thank Phil Hayes and Ralph Grishman for their helpful comments and criticism that in many ways improved the quality of this book. I wish also to thank Steven Brooks for insightful comments on theoretical aspects of the book (chapter 4, appendices A, B and C), and Rich Thomason for improving the linguistic part of the book (the very beginning of section 1.1).

Inductive Dependency Parsing

Author: Joakim Nivre
Publisher: Springer Science & Business Media
Total Pages: 224
Release: 2006-08-05
Genre: Computers
ISBN: 1402048890

This book describes the framework of inductive dependency parsing, a methodology for robust and efficient syntactic analysis of unrestricted natural language text. Coverage includes a theoretical analysis of central models and algorithms, and an empirical evaluation of memory-based dependency parsing using data from Swedish and English. A one-stop reference to dependency-based parsing of natural language, it will interest researchers and system developers in language technology, and is suitable for graduate or advanced undergraduate courses.

Generalized LR Parsing

Author: Masaru Tomita
Publisher: Springer Science & Business Media
Total Pages: 194
Release: 1991-08-31
Genre: Computers
ISBN: 9780792392019

The Generalized LR parsing algorithm (some call it "Tomita's algorithm") was originally developed in 1985 as a part of my Ph.D. thesis at Carnegie Mellon University. When I was a graduate student at CMU, I tried to build a couple of natural language systems based on existing parsing methods. Their parsing speed, however, always bothered me. I sometimes wondered whether it was ever possible to build a natural language parser that could parse reasonably long sentences in a reasonable time without help from large mainframe machines. At the same time, I was always amazed by the speed of programming language compilers, because they can parse very long sentences (i.e., programs) very quickly even on workstations. There are two reasons. First, programming languages are considerably simpler than natural languages. And secondly, they have very efficient parsing methods, most notably LR. The LR parsing algorithm first precompiles a grammar into an LR parsing table, and at the actual parsing time, it performs shift-reduce parsing guided deterministically by the parsing table. So, the key to the LR efficiency is the grammar precompilation; something that had never been tried for natural languages in 1985. Of course, there was a good reason why LR had never been applied to natural languages; it was simply impossible. If your context-free grammar is sufficiently more complex than programming languages, its LR parsing table will have multiple actions, and deterministic parsing will no longer be possible.
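
The table-driven shift-reduce loop described in the blurb above can be sketched in a few lines of Python. This is only an illustration and is not taken from the book: the toy grammar (E -> E + n | n), the hand-built ACTION and GOTO tables, and the names RULES and lr_parse are all assumptions made for this example.

```python
# Toy grammar (an assumption for this sketch, not from the book):
#   rule 1: E -> E + n
#   rule 2: E -> n
# Each rule maps to (left-hand side, number of right-hand-side symbols).
RULES = {1: ("E", 3), 2: ("E", 1)}

# Precompiled parsing table, built by hand for the toy grammar.
# Every (state, lookahead) cell holds exactly one action.
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 2),
    (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1),
    (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}


def lr_parse(tokens):
    """Return the sequence of rule numbers used to reduce the input."""
    stream = list(tokens) + ["$"]  # "$" marks end of input
    stack = [0]                    # stack of parser states
    pos = 0
    reductions = []
    while True:
        action = ACTION.get((stack[-1], stream[pos]))
        if action is None:
            raise SyntaxError(f"unexpected {stream[pos]!r} at position {pos}")
        kind, arg = action
        if kind == "shift":
            stack.append(arg)      # consume the token and enter the new state
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del stack[-length:]    # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(arg)
        else:                      # accept
            return reductions


print(lr_parse(["n", "+", "n"]))
```

Because each cell of ACTION holds a single entry, the loop never has to choose between alternatives. A grammar whose table needed two actions in the same cell could not be handled by this deterministic driver, which is the obstacle the blurb describes.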

Natural Language Processing with Python

Author: Steven Bird
Publisher: O'Reilly Media, Inc.
Total Pages: 506
Release: 2009-06-12
Genre: Computers
ISBN: 0596555717

This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication. Packed with examples and exercises, Natural Language Processing with Python will help you: extract information from unstructured text, either to guess the topic or identify "named entities"; analyze linguistic structure in text, including parsing and semantic analysis; access popular linguistic databases, including WordNet and treebanks; and integrate techniques drawn from fields as diverse as linguistics and artificial intelligence. This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.

Memory-Based Language Processing

Author: Walter Daelemans
Publisher: Cambridge University Press
Total Pages: 199
Release: 2005-09-01
Genre: Language Arts & Disciplines
ISBN: 1139445367

Memory-based language processing - a machine learning and problem solving method for language technology - is based on the idea that the direct reuse of examples using analogical reasoning is more suited for solving language processing problems than the application of rules extracted from those examples. This book discusses the theory and practice of memory-based language processing, showing its comparative strengths over alternative methods of language modelling. Language is complex, with few generalizations, many sub-regularities and exceptions, and the advantage of memory-based language processing is that it does not abstract away from this valuable low-frequency information. By applying the model to a range of benchmark problems, the authors show that for linguistic areas ranging from phonology to semantics, it produces excellent results. They also describe TiMBL, a software package for memory-based language processing. The first comprehensive overview of the approach, this book will be invaluable for computational linguists, psycholinguists and language engineers.
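
The core idea in the blurb above, reusing stored examples directly instead of extracting rules from them, can be illustrated with a small nearest-neighbour classifier. This is a sketch under stated assumptions, not the authors' software or its interface: the toy plural-suffix data, the feature choice (the last two letters of a noun), the simple mismatch-count distance, and the names MEMORY, overlap_distance, and classify are all made up for this example.

```python
from collections import Counter

# "Memory": stored examples kept verbatim, with no rules abstracted from them.
# Features are the last two letters of a noun; the label is its plural suffix.
# The data are a hypothetical toy set chosen for illustration.
MEMORY = [
    (("u", "s"), "es"),  # bus  -> buses
    (("o", "x"), "es"),  # box  -> boxes
    (("a", "t"), "s"),   # cat  -> cats
    (("o", "g"), "s"),   # dog  -> dogs
    (("s", "h"), "es"),  # dish -> dishes
]


def overlap_distance(a, b):
    """Count the feature positions where two examples differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(memory, features, k=1):
    """Label a new item by majority vote among its k nearest stored examples."""
    ranked = sorted(memory, key=lambda ex: overlap_distance(ex[0], features))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


print(classify(MEMORY, ("o", "x")))  # "fox" shares both features with "box"
print(classify(MEMORY, ("a", "t")))  # "hat" shares both features with "cat"
```

Nothing is learned ahead of time beyond storing the examples, so rare or exceptional cases stay available at classification time rather than being generalized away.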

Parsing Techniques

Author: Dick Grune
Publisher: Springer Science & Business Media
Total Pages: 677
Release: 2007-10-29
Genre: Computers
ISBN: 0387689540

This second edition of Grune and Jacobs’ brilliant work presents new developments and discoveries that have been made in the field. Parsing, also referred to as syntax analysis, has been and continues to be an essential part of computer science and linguistics. Parsing techniques have grown considerably in importance, both in computer science, where advanced compilers often use general CF parsers, and in computational linguistics, where such parsers are the only option. They are used in a variety of software products including Web browsers, interpreters in computer devices, and data compression programs; and they are used extensively in linguistics.

A Computational Model of Natural Language Communication

Author: Roland R. Hausser
Publisher: Springer Science & Business Media
Total Pages: 365
Release: 2006-09-28
Genre: Computers
ISBN: 3540354778

The ideal of using human language to control machines requires a practical theory of natural language communication that includes grammatical analysis of language signs, plus a model of the cognitive agent, with interfaces for recognition and action, an internal database, and an algorithm for reading content in and out. This book offers a functional framework for theoretical analysis of natural language communication and for practical applications of natural language processing.

Practical Aspects of Declarative Languages

Author: Manuel Carro
Publisher: Springer Science & Business Media
Total Pages: 307
Release: 2010-01-12
Genre: Computers
ISBN: 3642115020

This book constitutes the refereed proceedings of the 12th International Symposium on Practical Aspects of Declarative Languages, PADL 2010, held in Madrid, Spain, in January 2010, colocated with POPL 2010, the Symposium on Principles of Programming Languages. The 22 revised full papers presented together with 2 invited talks were carefully reviewed and selected from 58 submissions. The volume features original work emphasizing novel applications and implementation techniques for all forms of declarative concepts, including functions, relations, logic, and constraints. The papers address all current aspects of declarative programming; they are organized in topical sections on non-monotonic reasoning - answer set programming, types, parallelism and distribution, code quality assurance, domain specific languages, programming aids, constraints, and tabling - agents.