Cross-Language Information Retrieval

Author: Gregory Grefenstette
Publisher: Springer Science & Business Media
Total Pages: 190
Release: 2012-12-06
Genre: Computers
ISBN: 1461556619

Most of the papers in this volume were first presented at the Workshop on Cross-Linguistic Information Retrieval, held on August 22, 1996, during the SIGIR'96 Conference. Alan Smeaton of Dublin University and Paraic Sheridan of the ETH, Zurich, were the two other members of the Scientific Committee for this workshop. SIGIR is the Association for Computing Machinery (ACM) Special Interest Group on Information Retrieval, which has held conferences yearly since 1977. Three additional papers have been added: Chapter 4, Distributed Cross-Lingual Information Retrieval, describes the EMIR retrieval system, one of the first general cross-language systems to be implemented and evaluated; Chapter 6, Mapping Vocabularies Using Latent Semantic Indexing, which originally appeared as a technical report in the Laboratory for Computational Linguistics at Carnegie Mellon University in 1991, is included here because it was one of the earliest, though hard-to-find, publications showing the application of Latent Semantic Indexing to the problem of cross-language retrieval; and Chapter 10, A Weighted Boolean Model for Cross-Language Text Retrieval, describes a recent approach to solving the translation term weighting problem, specific to Cross-Language Information Retrieval.

Gregory Grefenstette

Contributors: Lisa Ballesteros and W. Bruce Croft (Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts); David Hull and Gregory Grefenstette (Xerox Research Centre Europe, Grenoble Laboratory); Thomas K. Landauer (Department of Psychology and Institute of Cognitive Science, University of Colorado, Boulder); Mark W. Davis (Computing Research Lab, New Mexico State University); Michael L. Littman; Bonnie J.

Cross-Language Information Retrieval

Author: Jian-Yun Nie
Publisher: Springer Nature
Total Pages: 125
Release: 2022-05-31
Genre: Computers
ISBN: 303102138X

Searching for information is no longer limited to the user's native language; it increasingly extends to other languages. This gives rise to the problem of cross-language information retrieval (CLIR), whose goal is to find relevant information written in a language different from that of the query. In addition to the problems of monolingual information retrieval (IR), translation is the key problem in CLIR: one must translate either the query or the documents from one language to another. However, this translation problem is not identical to full-text machine translation (MT): the goal is not to produce a human-readable translation, but a translation suitable for finding relevant documents. Specific translation methods are thus required. The goal of this book is to provide a comprehensive description of the specific problems arising in CLIR, the solutions proposed in this area, and the remaining problems. The book starts with a general description of the monolingual IR and CLIR problems. Different classes of approaches to translation are then presented: approaches using an MT system, dictionary-based translation, and approaches based on parallel and comparable corpora. In addition, the typical retrieval effectiveness of the different approaches is compared. It will be shown that translation approaches specifically designed for CLIR can rival and outperform high-quality MT systems. Finally, the book offers a look into the future that draws a strong parallel between query expansion in monolingual IR and query translation in CLIR, suggesting that many approaches developed in monolingual IR can be adapted to CLIR. The book can be used as an introduction to CLIR. Advanced readers can also find more technical details and discussions of the remaining research challenges. It is suitable for new researchers who intend to carry out research on CLIR.
Table of Contents: Preface / Introduction / Using Manually Constructed Translation Systems and Resources for CLIR / Translation Based on Parallel and Comparable Corpora / Other Methods to Improve CLIR / A Look into the Future: Toward a Unified View of Monolingual IR and CLIR? / References / Author Biography

Multilingual Information Retrieval

Author: Carol Peters
Publisher: Springer Science & Business Media
Total Pages: 232
Release: 2012-01-05
Genre: Computers
ISBN: 3642230075

We are living in a multilingual world, and the diversity of languages used to interact with information access systems has generated a wide variety of challenges to be addressed by computer and information scientists. The growing amount of non-English information accessible globally and the increased worldwide exposure of enterprises also necessitate the adaptation of Information Retrieval (IR) methods to new, multilingual settings. Peters, Braschler and Clough present a comprehensive description of the technologies involved in designing and developing systems for Multilingual Information Retrieval (MLIR). They provide readers with broad coverage of the various issues involved in creating systems that make digitally stored materials accessible regardless of the language(s) they are written in. The book also covers Cross-Language Information Retrieval (CLIR) in detail, helping readers understand how to develop retrieval systems that cross language boundaries. Their work is divided into six chapters and accompanies the reader step-by-step through the various stages involved in building, using and evaluating MLIR systems. The book concludes with some examples of recent applications that utilise MLIR technologies. Some of the techniques described have recently started to appear in commercial search systems, while others have the potential to be part of future incarnations. The book is intended for graduate students, scholars, and practitioners with a basic understanding of classical text retrieval methods. It offers guidelines and information on all aspects that need to be taken into consideration when building MLIR systems, while avoiding too many ‘hands-on details’ that could rapidly become obsolete. Thus it bridges the gap between the material covered by most of the classical IR textbooks and the novel requirements related to the acquisition and dissemination of information in whatever language it is stored.

Multilingual Information Access Evaluation I - Text Retrieval Experiments

Author: Carol Peters
Publisher: Springer Science & Business Media
Total Pages: 701
Release: 2010-09-13
Genre: Computers
ISBN: 364215753X

This book constitutes the thoroughly refereed proceedings of the 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009, held in Corfu, Greece, in September/October 2009. The volume reports experiments on various types of textual document collections. It is divided into six main sections presenting the results of the following tracks: Multilingual Document Retrieval (Ad-Hoc), Multiple Language Question Answering (QA@CLEF), Multilingual Information Filtering (INFILE@CLEF), Intellectual Property (CLEF-IP) and Log File Analysis (LogCLEF), plus the activities of the MorphoChallenge Program.

Advances in Multilingual and Multimodal Information Retrieval

Author: Cross-Language Evaluation Forum. Workshop
Publisher: Springer Science & Business Media
Total Pages: 942
Release: 2008-09-10
Genre: Computers
ISBN: 3540857591

This book constitutes the thoroughly refereed proceedings of the 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, held in Budapest, Hungary, in September 2007. The revised and extended papers were carefully reviewed and selected for inclusion in the book. There are 115 contributions in total and an introduction. The seven distinct evaluation tracks in CLEF 2007 are designed to test the performance of a wide range of multilingual information access systems or system components. The papers are organized in topical sections on Multilingual Textual Document Retrieval (Ad Hoc), Domain-Specific Information Retrieval (Domain-Specific), Multiple Language Question Answering (QA@CLEF), cross-language retrieval in image collections (ImageCLEF), cross-language speech retrieval (CL-SR), multilingual Web retrieval (WebCLEF), cross-language geographical retrieval (GeoCLEF), and CLEF in other evaluations.

Accessing Multilingual Information Repositories

Author: Fredric Gey
Publisher: Springer
Total Pages: 1032
Release: 2006-10-15
Genre: Computers
ISBN: 3540457003

This book constitutes the thoroughly refereed postproceedings of the 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005. The book presents 111 revised papers together with an introduction. Topical sections include multilingual textual document retrieval, cross-language and more, monolingual experiments, domain-specific information retrieval, interactive cross-language information retrieval, multiple language question answering, cross-language retrieval in image collections, cross-language speech retrieval, multilingual Web track, cross-language geographical retrieval, and evaluation issues.

Multilingual Information Access for Text, Speech and Images

Author: Cross-Language Evaluation Forum. Workshop
Publisher: Springer Science & Business Media
Total Pages: 860
Release: 2005-07-20
Genre: Computers
ISBN: 3540274200

This book constitutes the thoroughly refereed postproceedings of the 5th Workshop of the Cross-Language Evaluation Forum, CLEF 2004, held in Bath, UK in September 2004. The 80 revised papers presented together with an introduction were carefully reviewed and selected for inclusion in the book. The papers are organized in topical sections on ad hoc text retrieval tracks (mainly cross-language experiments and monolingual experiments), domain-specific document retrieval, interactive cross-language information retrieval, multiple language question answering, cross-language retrieval in image collections, cross-language spoken document retrieval, and on issues in CLIR and in evaluation.

Evaluation of Multilingual and Multi-modal Information Retrieval

Author: Cross-Language Evaluation Forum. Workshop
Publisher: Springer Science & Business Media
Total Pages: 1018
Release: 2007-09-06
Genre: Computers
ISBN: 3540749985

This book constitutes the thoroughly refereed postproceedings of the 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, held in Alicante, Spain, in September 2006. The revised papers presented together with an introduction were carefully reviewed and selected for inclusion in the book. The papers are organized in topical sections on Multilingual Textual Document Retrieval, Domain-Specific Information Retrieval, i-CLEF, QA@CLEF, ImageCLEF, CLSR, WebCLEF and GeoCLEF.

Neural Approaches to Conversational Information Retrieval

Author: Jianfeng Gao
Publisher: Springer Nature
Total Pages: 217
Release: 2023-03-16
Genre: Computers
ISBN: 3031230809

This book surveys recent advances in Conversational Information Retrieval (CIR), focusing on neural approaches that have been developed in the last few years. Progress in deep learning has brought tremendous improvements in natural language processing (NLP) and conversational AI, leading to a plethora of commercial conversational services that allow naturally spoken and typed interaction, increasing the need for more human-centric interactions in IR. The book contains nine chapters. Chapter 1 motivates the research of CIR by reviewing the studies on how people search and subsequently defines a CIR system and a reference architecture which is described in detail in the rest of the book. Chapter 2 provides a detailed discussion of techniques for evaluating a CIR system – a goal-oriented conversational AI system with a human in the loop. Then Chapters 3 to 7 describe the algorithms and methods for developing the main CIR modules (or sub-systems). In Chapter 3, conversational document search is discussed, which can be viewed as a sub-system of the CIR system. Chapter 4 is about algorithms and methods for query-focused multi-document summarization. Chapter 5 describes various neural models for conversational machine comprehension, which generate a direct answer to a user query based on retrieved query-relevant documents, while Chapter 6 details neural approaches to conversational question answering over knowledge bases, which is fundamental to the knowledge base search module of a CIR system. Chapter 7 elaborates various techniques and models that aim to equip a CIR system with the capability of proactively leading a human-machine conversation. Chapter 8 reviews a variety of commercial systems for CIR and related tasks. It first presents an overview of research platforms and toolkits which enable scientists and practitioners to build conversational experiences, and continues with historical highlights and recent trends in a range of application areas. 
Chapter 9 concludes the book with a brief discussion of research trends and areas for future work. The primary target audience of the book is the IR and NLP research communities. However, readers with other backgrounds, such as machine learning or human-computer interaction, will also find it an accessible introduction to CIR.