Applied Text Analysis with Python

Author: Benjamin Bengfort
Publisher: O'Reilly Media, Inc.
Total Pages: 328
Release: 2018-06-11
Genre: Computers
ISBN: 1491962992

From news and speeches to informal chatter on social media, natural language is one of the richest and most underutilized sources of data. Not only does it come in a constant stream, always changing and adapting in context; it also contains information that is not conveyed by traditional data sources. The key to unlocking natural language is the creative application of text analytics. This practical book presents a data scientist’s approach to building language-aware products with applied machine learning. You’ll learn robust, repeatable, and scalable techniques for text analysis with Python, including contextual and linguistic feature engineering, vectorization, classification, topic modeling, entity resolution, graph analysis, and visual steering. By the end of the book, you’ll be equipped with practical methods to solve any number of complex real-world problems.
- Preprocess and vectorize text into high-dimensional feature representations
- Perform document classification and topic modeling
- Steer the model selection process with visual diagnostics
- Extract key phrases, named entities, and graph structures to reason about data in text
- Build a dialog framework to enable chatbots and language-driven interaction
- Use Spark to scale processing power and neural networks to scale model complexity

Text Mining with R

Author: Julia Silge
Publisher: O'Reilly Media, Inc.
Total Pages: 193
Release: 2017-06-12
Genre: Computers
ISBN: 1491981628

Chapter 7. Case Study: Comparing Twitter Archives; Getting the Data and Distribution of Tweets; Word Frequencies; Comparing Word Usage; Changes in Word Use; Favorites and Retweets; Summary; Chapter 8. Case Study: Mining NASA Metadata; How Data Is Organized at NASA; Wrangling and Tidying the Data; Some Initial Simple Exploration; Word Co-occurrences and Correlations; Networks of Description and Title Words; Networks of Keywords; Calculating tf-idf for the Description Fields; What Is tf-idf for the Description Field Words?; Connecting Description Fields to Keywords; Topic Modeling.

Textual Analysis

Author: Alan McKee
Publisher: SAGE
Total Pages: 178
Release: 2003-04-03
Genre: Business & Economics
ISBN: 9780761949930

Textual analysis is a methodology - a way of gathering data - for researchers who are interested in the ways in which people make sense of the world.

Text Analysis with R

Author: Matthew L. Jockers
Publisher: Springer Nature
Total Pages: 277
Release: 2020-03-30
Genre: Computers
ISBN: 3030396436

Now in its second edition, Text Analysis with R provides a practical introduction to computational text analysis using the open source programming language R. R is an extremely popular programming language, used throughout the sciences; due to its accessibility, R is now used increasingly in other research areas. In this volume, readers immediately begin working with text, and each chapter examines a new technique or process, allowing readers to obtain a broad exposure to core R procedures and a fundamental understanding of the possibilities of computational text analysis at both the micro and the macro scale. Each chapter builds on its predecessor as readers move from small scale “microanalysis” of single texts to large scale “macroanalysis” of text corpora, and each concludes with a set of practice exercises that reinforce and expand upon the chapter lessons. The book’s focus is on making the technical palatable and making the technical useful and immediately gratifying. Text Analysis with R is written with students and scholars of literature in mind but will be applicable to other humanists and social scientists wishing to extend their methodological toolkit to include quantitative and computational approaches to the study of text. Computation provides access to information in text that readers simply cannot gather using traditional qualitative methods of close reading and human synthesis. This new edition features two new chapters: one that introduces dplyr and tidyr in the context of parsing and analyzing dramatic texts to extract speaker and receiver data, and one on sentiment analysis using the syuzhet package. It is also filled with updated material in every chapter to integrate new developments in the field, current practices in R style, and the use of more efficient algorithms.

Supervised Machine Learning for Text Analysis in R

Author: Emil Hvitfeldt
Publisher: CRC Press
Total Pages: 402
Release: 2021-10-22
Genre: Computers
ISBN: 1000461971

Text data is important for many domains, from healthcare to marketing to the digital humanities, but specialized approaches are necessary to create features for machine learning from language. Supervised Machine Learning for Text Analysis in R explains how to preprocess text data for modeling, train models, and evaluate model performance using tools from the tidyverse and tidymodels ecosystem. Models like these can be used to make predictions for new observations, to understand what natural language features or characteristics contribute to differences in the output, and more. If you are already familiar with the basics of predictive modeling, use the comprehensive, detailed examples in this book to extend your skills to the domain of natural language processing. This book provides practical guidance and directly applicable knowledge for data scientists and analysts who want to integrate unstructured text data into their modeling pipelines. Learn how to use text data for both regression and classification tasks, and how to apply more straightforward algorithms like regularized regression or support vector machines as well as deep learning approaches. Natural language must be dramatically transformed to be ready for computation, so we explore typical text preprocessing and feature engineering steps like tokenization and word embeddings from the ground up. These steps influence model results in ways we can measure, both in terms of model metrics and other tangible consequences such as how fair or appropriate model results are.

Qualitative Text Analysis

Author: Udo Kuckartz
Publisher: SAGE
Total Pages: 193
Release: 2014-01-23
Genre: Reference
ISBN: 1446297764

How can you analyse narratives, interviews, field notes, or focus group data? Qualitative text analysis is ideal for these types of data and this textbook provides a hands-on introduction to the method and its theoretical underpinnings. It offers step-by-step instructions for implementing the three principal types of qualitative text analysis: thematic, evaluative, and type-building. Special attention is paid to how to present your results and use qualitative data analysis software packages, which are highly recommended for use in combination with qualitative text analysis since they allow for fast, reliable, and more accurate analysis. The book shows in detail how to use software, from transcribing the verbal data to presenting and visualizing the results. The book is intended for Master’s and Doctoral students across the social sciences and for all researchers concerned with the systematic analysis of texts of any kind.

An Introduction to Text Mining

Author: Gabe Ignatow
Publisher: SAGE Publications
Total Pages: 345
Release: 2017-09-22
Genre: Computers
ISBN: 150633699X

Students in social science courses communicate, socialize, shop, learn, and work online. When they are asked to collect data for course projects they are often drawn to social media platforms and other online sources of textual data. There are many software packages and programming languages available to help students collect data online, and there are many texts designed to help with different forms of online research, from surveys to ethnographic interviews. But there is no textbook available that teaches students how to construct a viable research project based on online sources of textual data such as newspaper archives, site user comment archives, digitized historical documents, or social media user comment archives. Gabe Ignatow and Rada F. Mihalcea's new text An Introduction to Text Mining will be a starting point for undergraduates and first-year graduate students interested in collecting and analyzing textual data from online sources, and will cover the most critical issues that students must take into consideration at all stages of their research projects, including: ethical and philosophical issues; issues related to research design; web scraping and crawling; strategic data selection; data sampling; use of specific text analysis methods; and report writing.

Text Analysis in Translation

Author: Christiane Nord
Publisher: BRILL
Total Pages: 284
Release: 2006-01-01
Genre: Language Arts & Disciplines
ISBN: 900450091X

Text Analysis in Translation has become a classic in Translation Studies. Based on a functional approach to translation and indebted to pragmatic text linguistics, it suggests a model for translation-oriented source-text analysis applicable to all text types and genres independent of the language and culture pairs involved. Part 1 of the study presents the theoretical framework on which the model is based, and surveys the various concepts of translation theory and text linguistics. Part 2 describes the role and scope of source-text analysis in the translation process and explains why the model is relevant to translation. Part 3 presents a detailed study of the extratextual and intratextual factors and their interaction in the text, using numerous examples from all areas of professional translation. Part 4 discusses the applications of the model to translator training, placing particular emphasis on the selection of material for translation classes, grading the difficulty of translation tasks, and translation quality assessment. The book concludes with the practical analysis of a number of texts and their translations, taking into account various text types and several languages (German, English, Spanish, French, Italian, Portuguese, and Dutch).

Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications

Author: Gary Miner
Publisher: Academic Press
Total Pages: 1096
Release: 2012-01-11
Genre: Computers
ISBN: 012386979X

The world contains an unimaginably vast amount of digital information which is getting ever vaster ever more rapidly. This makes it possible to do many things that previously could not be done: spot business trends, prevent diseases, combat crime and so on. Managed well, the textual data can be used to unlock new sources of economic value, provide fresh insights into science and hold governments to account. As the Internet expands and our natural capacity to process the unstructured text that it contains diminishes, the value of text mining for information retrieval and search will increase dramatically. This comprehensive professional reference brings together all the information, tools and methods a professional will need to efficiently use text mining applications and statistical analysis. The Handbook of Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications presents a comprehensive how-to reference that shows the user how to conduct text mining and statistically analyze results. In addition to providing an in-depth examination of core text mining and link detection tools, methods and operations, the book examines advanced preprocessing techniques, knowledge representation considerations, and visualization approaches. Finally, the book explores current real-world, mission-critical applications of text mining and link detection using real-world example tutorials in such varied fields as corporate, finance, business intelligence, genomics research, and counterterrorism activities.

Natural Language Processing and Text Mining

Author: Anne Kao
Publisher: Springer Science & Business Media
Total Pages: 272
Release: 2007-03-06
Genre: Computers
ISBN: 1846287545

Natural Language Processing and Text Mining not only discusses applications of Natural Language Processing techniques to certain Text Mining tasks, but also the converse, the use of Text Mining to assist NLP. It assembles diverse views from internationally recognized researchers and emphasizes caveats in the attempt to apply Natural Language Processing to text mining. This state-of-the-art survey is a must-have for advanced students, professionals, and researchers.