Computational and Statistical Methods for Analysing Big Data with Applications

Author: Shen Liu
Publisher: Academic Press
Total Pages: 208
Release: 2015-11-20
Genre: Mathematics
ISBN: 0081006519

Due to the scale and complexity of data sets currently being collected in areas such as health, transportation, environmental science, engineering, information technology, business and finance, modern quantitative analysts are seeking improved and appropriate computational and statistical methods to explore, model and draw inferences from big data. This book aims to introduce suitable approaches for such endeavours, providing applications and case studies for the purpose of demonstration. Computational and Statistical Methods for Analysing Big Data with Applications starts with an overview of the era of big data. It then goes on to explain the computational and statistical methods which have been commonly applied in the big data revolution. For each of these methods, an example is provided as a guide to its application. Five case studies are presented next, focusing on computer vision with massive training data, spatial data analysis, advanced experimental design methods for big data, big data in clinical medicine, and analysing data collected from mobile devices, respectively. The book concludes with some final thoughts and suggested areas for future research in big data.

- Advanced computational and statistical methodologies for analysing big data are developed
- Experimental design methodologies are described and implemented to make the analysis of big data more computationally tractable
- Case studies are discussed to demonstrate the implementation of the developed methods
- Five high-impact areas of application are studied: computer vision, geosciences, commerce, healthcare and transportation
- Computing code/programs are provided where appropriate

Advanced Statistical Methods for the Analysis of Large Data-Sets

Author: Agostino Di Ciaccio
Publisher: Springer Science & Business Media
Total Pages: 464
Release: 2012-03-14
Genre: Mathematics
ISBN: 3642210368

The theme of the meeting was “Statistical Methods for the Analysis of Large Data-Sets”. In recent years there has been increasing interest in this subject; in fact a huge quantity of information is often available but standard statistical techniques are usually not well suited to managing this kind of data. The conference serves as an important meeting point for European researchers working on this topic and a number of European statistical societies participated in the organization of the event. The book includes 45 papers from a selection of the 156 papers accepted for presentation and discussed at the conference on “Advanced Statistical Methods for the Analysis of Large Data-sets.”

Understanding Advanced Statistical Methods

Author: Peter Westfall
Publisher: CRC Press
Total Pages: 572
Release: 2013-04-09
Genre: Mathematics
ISBN: 1466512105

Providing a much-needed bridge between elementary statistics courses and advanced research methods courses, Understanding Advanced Statistical Methods helps students grasp the fundamental assumptions and machinery behind sophisticated statistical topics, such as logistic regression, maximum likelihood, bootstrapping, nonparametrics, and Bayesian methods. The book teaches students how to properly model, think critically, and design their own studies to avoid common errors. It leads them to think differently not only about math and statistics but also about general research and the scientific method. With a focus on statistical models as producers of data, the book enables students to more easily understand the machinery of advanced statistics. It also downplays the "population" interpretation of statistical models and presents Bayesian methods before frequentist ones. Requiring no prior calculus experience, the text employs a "just-in-time" approach that introduces mathematical topics, including calculus, where needed. Formulas throughout the text are used to explain why calculus and probability are essential in statistical modeling. The authors also intuitively explain the theory and logic behind real data analysis, incorporating a range of application examples from the social, economic, biological, medical, physical, and engineering sciences. Enabling your students to answer the why behind statistical methods, this text teaches them how to successfully draw conclusions when the premises are flawed. It empowers them to use advanced statistical methods with confidence and develop their own statistical recipes. Ancillary materials are available on the book’s website.

Making Sense of Statistical Methods in Social Research

Author: Keming Yang
Publisher: SAGE
Total Pages: 218
Release: 2010-03-25
Genre: Social Science
ISBN: 1446205592

Making Sense of Statistical Methods in Social Research is a critical introduction to the use of statistical methods in social research. It provides a unique approach to statistics that concentrates on helping social researchers think about the conceptual basis for the statistical methods they’re using. Whereas other statistical methods books instruct students in how to get through the statistics-based elements of their chosen course with as little mathematical knowledge as possible, this book aims to improve students’ statistical literacy, with the ultimate goal of turning them into competent researchers. Making Sense of Statistical Methods in Social Research contains careful discussion of the conceptual foundation of statistical methods, specifying what questions they can, or cannot, answer. The logic of each statistical method or procedure is explained, drawing on the historical development of the method, existing publications that apply the method, and methodological discussions. Statistical techniques and procedures are presented not for the purpose of showing how to produce statistics with certain software packages, but as a way of illuminating the underlying logic behind the symbols. The limited statistical knowledge that students gain from straightforward ‘how-to’ books makes it very hard for them to move beyond introductory statistics courses to postgraduate study and research. This book should help to bridge this gap.

Statistical Learning for Big Dependent Data

Author: Daniel Peña
Publisher: John Wiley & Sons
Total Pages: 562
Release: 2021-05-04
Genre: Mathematics
ISBN: 1119417384

Master advanced topics in the analysis of large, dynamically dependent datasets with this insightful resource.

Statistical Learning for Big Dependent Data delivers a comprehensive presentation of the statistical and machine learning methods useful for analyzing and forecasting large and dynamically dependent data sets. The book presents automatic procedures for modelling and forecasting large sets of time series data. Beginning with some visualization tools, the book discusses procedures and methods for finding outliers, clusters, and other types of heterogeneity in big dependent data. It then introduces various dimension reduction methods, including regularization and factor models such as regularized Lasso in the presence of dynamical dependence and dynamic factor models. The book also covers other forecasting procedures, including index models, partial least squares, boosting, and now-casting. It further presents machine-learning methods, including neural networks, deep learning, classification and regression trees, and random forests. Finally, procedures for modelling and forecasting spatio-temporal dependent data are also presented. Throughout the book, the advantages and disadvantages of the methods discussed are given. The book uses real-world examples to demonstrate applications, including the use of many R packages. An R package associated with the book is also available to assist readers in reproducing the analyses of examples and to facilitate real applications.

Statistical Learning for Big Dependent Data includes a wide variety of topics for modeling and understanding big dependent data, such as:

- New ways to plot large sets of time series
- An automatic procedure to build univariate ARMA models for individual components of a large data set
- Powerful outlier detection procedures for large sets of related time series
- New methods for finding the number of clusters of time series and discrimination methods, including support vector machines, for time series
- Broad coverage of dynamic factor models, including new representations and estimation methods for generalized dynamic factor models
- Discussion of the usefulness of lasso with time series and an evaluation of several machine learning procedures for forecasting large sets of time series
- Forecasting large sets of time series with exogenous variables, including discussions of index models, partial least squares, and boosting
- Introduction of modern procedures for modeling and forecasting spatio-temporal data

Perfect for PhD students and researchers in business, economics, engineering, and science, Statistical Learning for Big Dependent Data also belongs on the bookshelves of practitioners in these fields who hope to improve their understanding of statistical and machine learning methods for analyzing and forecasting big dependent data.

Federal Statistics, Multiple Data Sources, and Privacy Protection

Author: National Academies of Sciences, Engineering, and Medicine
Publisher: National Academies Press
Total Pages: 195
Release: 2018-01-27
Genre: Social Science
ISBN: 0309465370

The environment for obtaining information and providing statistical data for policy makers and the public has changed significantly in the past decade, raising questions about the fundamental survey paradigm that underlies federal statistics. New data sources provide opportunities to develop a new paradigm that can improve timeliness, geographic or subpopulation detail, and statistical efficiency. It also has the potential to reduce the costs of producing federal statistics. The panel's first report described federal statistical agencies' current paradigm, which relies heavily on sample surveys for producing national statistics, and the challenges agencies are facing; the legal frameworks and mechanisms for protecting the privacy and confidentiality of statistical data and for providing researchers access to data, and challenges to those frameworks and mechanisms; and statistical agencies' access to alternative sources of data. The panel recommended a new approach for federal statistical programs that would combine diverse data sources from government and private sector sources, and the creation of a new entity that would provide the foundational elements needed for this new approach, including legal authority to access data and protect privacy. This second of the panel's two reports builds on the analysis, conclusions, and recommendations in the first one. This report assesses alternative methods for implementing a new approach that would combine diverse data sources from government and private sector sources, including describing statistical models for combining data from multiple sources; examining statistical and computer science approaches that foster privacy protections; evaluating frameworks for assessing the quality and utility of alternative data sources; and considering various models for implementing the recommended new entity.
Together, the two reports offer ideas and recommendations to help federal statistical agencies examine and evaluate data from alternative sources and then combine them as appropriate to provide the country with more timely, actionable, and useful information for policy makers, businesses, and individuals.

Statistical Methods for Survival Data Analysis

Author: Elisa T. Lee
Publisher: John Wiley & Sons
Release: 2013-10-07
Genre: Mathematics
ISBN: 9781118095027

Praise for the Third Edition: “. . . an easy-to-read introduction to survival analysis which covers the major concepts and techniques of the subject.” —Statistics in Medical Research

Updated and expanded to reflect the latest developments, Statistical Methods for Survival Data Analysis, Fourth Edition continues to deliver a comprehensive introduction to the most commonly used methods for analyzing survival data. Written by a uniquely well-qualified author team, the Fourth Edition is a critically acclaimed guide to statistical methods with applications in clinical trials, epidemiology, areas of business, and the social sciences. The book features many real-world examples to illustrate applications within these various fields, although special consideration is given to the study of survival data in biomedical sciences. Emphasizing the latest research and providing the most up-to-date information regarding software applications in the field, Statistical Methods for Survival Data Analysis, Fourth Edition also includes:

- Marginal and random effect models for analyzing correlated censored or uncensored data
- Multiple types of two-sample and K-sample comparison analysis
- Updated treatment of parametric methods for regression model fitting with a new focus on accelerated failure time models
- Expanded coverage of the Cox proportional hazards model
- Exercises at the end of each chapter to deepen knowledge of the presented material

Statistical Methods for Survival Data Analysis is an ideal text for upper-undergraduate and graduate-level courses on survival data analysis. The book is also an excellent resource for biomedical investigators, statisticians, and epidemiologists, as well as researchers in every field in which the analysis of survival data plays a role.

Statistical Methods for the Social and Behavioural Sciences

Author: David B. Flora
Publisher: SAGE
Total Pages: 786
Release: 2017-12-11
Genre: Social Science
ISBN: 1526421925

Statistical methods in modern research increasingly entail developing, estimating and testing models for data. Rather than rigid methods of data analysis, the need today is for more flexible methods for modelling data. In this logical, easy-to-follow and exceptionally clear book, David Flora provides a comprehensive survey of the major statistical procedures currently used. His innovative model-based approach teaches you how to:

- Understand and choose the right statistical model to fit your data
- Match substantive theory and statistical models
- Apply statistical procedures hands-on, with example data analyses
- Develop and use graphs to understand data and fit models to data
- Work with statistical modeling principles using any software package
- Learn by applying, with input and output files for R, SAS, SPSS, and Mplus

Statistical Methods for the Social and Behavioural Sciences: A Model Based Approach is the essential guide for those looking to extend their understanding of the principles of statistics, and begin using the right statistical modeling method for their own data. It is particularly suited to second or advanced courses in statistical methods across the social and behavioural sciences.

New Advances in Statistics and Data Science

Author: Ding-Geng Chen
Publisher: Springer
Total Pages: 355
Release: 2018-01-17
Genre: Mathematics
ISBN: 3319694162

This book comprises presentations delivered at the 25th ICSA Applied Statistics Symposium, held at the Hyatt Regency Atlanta on June 12-15, 2016. The symposium attracted more than 700 statisticians and data scientists working in academia, government, and industry from all over the world. The theme of the conference was the “Challenge of Big Data and Applications of Statistics,” in recognition of the advent of the big data era, and the symposium offered opportunities for learning, for drawing inspiration from old research ideas and developing new ones, and for promoting further research collaborations in the data sciences. The invited contributions addressed rich topics closely related to big data analysis in the data sciences, reflecting recent advances and major challenges in statistics, business statistics, and biostatistics. Subsequently, the six editors selected 19 high-quality presentations and invited the speakers to prepare full chapters for this book, which showcases new methods in statistics and data sciences, emerging theories, and case applications from statistics, data science and interdisciplinary fields. The topics covered in the book are timely and have great impact on the data sciences, identifying important directions for future research, promoting advanced statistical methods in big data science, and facilitating future collaborations across disciplines and between theory and practice.