A Computational Approach to Statistical Learning

Author: Taylor Arnold
Publisher: CRC Press
Total Pages: 377
Release: 2019-01-23
Genre: Business & Economics
ISBN: 1351694766

A Computational Approach to Statistical Learning gives a novel introduction to predictive modeling by focusing on the algorithmic and numeric motivations behind popular statistical methods. The text contains annotated code to over 80 original reference functions. These functions provide minimal working implementations of common statistical learning algorithms. Every chapter concludes with a fully worked-out application that illustrates predictive modeling tasks using a real-world dataset. The text begins with a detailed analysis of linear models and ordinary least squares. Subsequent chapters explore extensions such as ridge regression, generalized linear models, and additive models. The second half focuses on the use of general-purpose algorithms for convex optimization and their application to tasks in statistical learning. Models covered include the elastic net, dense neural networks, convolutional neural networks (CNNs), and spectral clustering. A unifying theme throughout the text is the use of optimization theory in the description of predictive models, with a particular focus on the singular value decomposition (SVD). Through this theme, the computational approach motivates and clarifies the relationships between various predictive models.

Taylor Arnold is an assistant professor of statistics at the University of Richmond. His work at the intersection of computer vision, natural language processing, and digital humanities has been supported by multiple grants from the National Endowment for the Humanities (NEH) and the American Council of Learned Societies (ACLS). His first book, Humanities Data in R, was published in 2015. Michael Kane is an assistant professor of biostatistics at Yale University. He is the recipient of grants from the National Institutes of Health (NIH), DARPA, and the Bill and Melinda Gates Foundation. His R package bigmemory won the Chambers prize for statistical software in 2010. Bryan Lewis is an applied mathematician and author of many popular R packages, including irlba, doRedis, and threejs.

Computational Learning Approaches to Data Analytics in Biomedical Applications

Author: Khalid Al-Jabery
Publisher: Academic Press
Total Pages: 312
Release: 2019-11-20
Genre: Technology & Engineering
ISBN: 0128144831

Computational Learning Approaches to Data Analytics in Biomedical Applications provides a unified framework for biomedical data analysis using varied machine learning and statistical techniques. It presents insights on biomedical data processing, innovative clustering algorithms and techniques, and connections between statistical analysis and clustering. The book introduces and discusses the major problems relating to data analytics, provides a review of influential and state-of-the-art learning algorithms for biomedical applications, reviews cluster validity indices and how to select the appropriate index, and includes an overview of statistical methods that can be applied to increase confidence in the clustering framework and analysis of the results obtained.

- Includes an overview of data analytics in biomedical applications and current challenges
- Updates on the latest research in supervised learning algorithms and applications, clustering algorithms and cluster validation indices
- Provides complete coverage of computational and statistical analysis tools for biomedical data analysis
- Presents hands-on training on the use of Python libraries, MATLAB® tools, WEKA, SAP-HANA and R/Bioconductor

Information Theory and Statistical Learning

Author: Frank Emmert-Streib
Publisher: Springer Science & Business Media
Total Pages: 443
Release: 2009
Genre: Computers
ISBN: 0387848150

This interdisciplinary text offers theoretical and practical results of information theoretic methods used in statistical learning. It presents a comprehensive overview of the many different methods that have been developed in numerous contexts.

Algebraic Geometry and Statistical Learning Theory

Author: Sumio Watanabe
Publisher: Cambridge University Press
Total Pages: 295
Release: 2009-08-13
Genre: Computers
ISBN: 0521864674

Sure to be influential, Watanabe's book lays the foundations for the use of algebraic geometry in statistical learning theory. Many models and learning machines are singular: mixture models, neural networks, HMMs, Bayesian networks, and stochastic context-free grammars are major examples. The theory developed here underpins accurate estimation techniques in the presence of singularities.

The Elements of Statistical Learning

Author: Trevor Hastie
Publisher: Springer Science & Business Media
Total Pages: 545
Release: 2013-11-11
Genre: Mathematics
ISBN: 0387216065

During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book’s coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees, and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for “wide” data (p bigger than n), including multiple testing and false discovery rates.

Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.

Boosting

Author: Robert E. Schapire
Publisher: MIT Press
Total Pages: 544
Release: 2014-01-10
Genre: Computers
ISBN: 0262526034

An accessible introduction and essential reference for an approach to machine learning that creates highly accurate prediction rules by combining many weak and inaccurate ones. Boosting is an approach to machine learning based on the idea of creating a highly accurate predictor by combining many weak and inaccurate “rules of thumb.” A remarkably rich theory has evolved around boosting, with connections to a range of topics, including statistics, game theory, convex optimization, and information geometry. Boosting algorithms have also enjoyed practical success in such fields as biology, vision, and speech processing. At various times in its history, boosting has been perceived as mysterious, controversial, even paradoxical. This book, written by the inventors of the method, brings together, organizes, simplifies, and substantially extends two decades of research on boosting, presenting both theory and applications in a way that is accessible to readers from diverse backgrounds while also providing an authoritative reference for advanced researchers. With its introductory treatment of all material and its inclusion of exercises in every chapter, the book is appropriate for course use as well. The book begins with a general introduction to machine learning algorithms and their analysis; then explores the core theory of boosting, especially its ability to generalize; examines some of the myriad other theoretical viewpoints that help to explain and understand boosting; provides practical extensions of boosting for more complex learning problems; and finally presents a number of advanced theoretical topics. Numerous applications and practical illustrations are offered throughout.

Computational Statistics

Author: James E. Gentle
Publisher: Springer Science & Business Media
Total Pages: 732
Release: 2009-07-28
Genre: Mathematics
ISBN: 0387981446

Computational inference is based on an approach to statistical methods that uses modern computational power to simulate distributional properties of estimators and test statistics. This book describes computationally intensive statistical methods in a unified presentation, emphasizing techniques, such as the PDF decomposition, that arise in a wide range of methods.

Computational and Statistical Methods for Analysing Big Data with Applications

Author: Shen Liu
Publisher: Academic Press
Total Pages: 208
Release: 2015-11-20
Genre: Mathematics
ISBN: 0081006519

Due to the scale and complexity of data sets currently being collected in areas such as health, transportation, environmental science, engineering, information technology, business and finance, modern quantitative analysts are seeking improved and appropriate computational and statistical methods to explore, model and draw inferences from big data. This book aims to introduce suitable approaches for such endeavours, providing applications and case studies for the purpose of demonstration. Computational and Statistical Methods for Analysing Big Data with Applications starts with an overview of the era of big data. It then goes on to explain the computational and statistical methods which have been commonly applied in the big data revolution. For each of these methods, an example is provided as a guide to its application. Five case studies are presented next, focusing on computer vision with massive training data, spatial data analysis, advanced experimental design methods for big data, big data in clinical medicine, and analysing data collected from mobile devices, respectively. The book concludes with some final thoughts and suggested areas for future research in big data.

- Advanced computational and statistical methodologies for analysing big data are developed
- Experimental design methodologies are described and implemented to make the analysis of big data more computationally tractable
- Case studies are discussed to demonstrate the implementation of the developed methods
- Five high-impact areas of application are studied: computer vision, geosciences, commerce, healthcare and transportation
- Computing code/programs are provided where appropriate

Structural Reliability

Author: Jorge Eduardo Hurtado
Publisher: Springer Science & Business Media
Total Pages: 267
Release: 2013-11-11
Genre: Technology & Engineering
ISBN: 3540409874

The last decades have witnessed the development of methods for solving structural reliability problems, which emerged from the efforts of numerous researchers all over the world. For the specific and most common problem of determining the probability of failure of a structural system in which the limit state function g(x) = 0 is only implicitly known, the proposed methods can be grouped into two main categories:

• Methods based on the Taylor expansion of the performance function g(x) about the most likely failure point (the design point), which is determined in the solution process. These methods are known as FORM and SORM (First- and Second-Order Reliability Methods, respectively).
• Monte Carlo methods, which require repeated calls of the numerical (normally finite element) solver of the structural model using a random realization of the basic variable set x each time.

In the first category of methods only SORM can be considered of a wide applicability. However, it requires the knowledge of the first and second derivatives of the performance function, whose calculation in several dimensions either implies a high computational effort when faced with finite difference techniques or special programs when using perturbation techniques, which nevertheless require the use of large matrices in their computations. In order to simplify this task, use has been proposed of techniques that can be regarded as variants of the Response Surface Method.
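
The Monte Carlo category described in this blurb can be illustrated with a minimal sketch. It is not taken from the book. It assumes the common convention that failure corresponds to g(x) <= 0, and it replaces the expensive finite element solver with a cheap, hypothetical analytic limit state g = R - S, where resistance R ~ N(5, 1) and load S ~ N(2, 1) are independent. The function names and distribution parameters are illustrative assumptions only.

```python
import math
import random


def estimate_failure_probability(g, sample_x, n_samples=100_000, seed=0):
    """Crude Monte Carlo estimate of P[g(X) <= 0].

    g        -- performance function; g(x) <= 0 is treated as failure (assumed convention)
    sample_x -- draws one random realization of the basic variables from an rng
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        x = sample_x(rng)      # one random realization of the basic variable set x
        if g(x) <= 0.0:        # one "solver call" per sample
            failures += 1
    return failures / n_samples


# Hypothetical stand-in for a structural model: g = resistance - load.
def limit_state(x):
    resistance, load = x
    return resistance - load


def sample_basic_variables(rng):
    # Assumed independent normal variables: R ~ N(5, 1), S ~ N(2, 1).
    return (rng.gauss(5.0, 1.0), rng.gauss(2.0, 1.0))


p_f = estimate_failure_probability(limit_state, sample_basic_variables)

# For this toy case R - S ~ N(3, sqrt(2)), so the exact value is
# Phi(-3 / sqrt(2)) = 0.5 * erfc(1.5), which allows a sanity check.
p_exact = 0.5 * math.erfc(1.5)
print(f"Monte Carlo estimate: {p_f:.4f}, exact: {p_exact:.4f}")
```

Each sample costs one evaluation of g. With a real finite element model behind g, that cost is what makes crude Monte Carlo expensive for small failure probabilities.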

Statistical Modeling and Machine Learning for Molecular Biology

Author: Alan Moses
Publisher: CRC Press
Total Pages: 281
Release: 2017-01-06
Genre: Computers
ISBN: 1482258609

• Assumes no background in statistics or computers
• Covers most major types of molecular biological data
• Covers the statistical and machine learning concepts of most practical utility (P-values, clustering, regression, regularization and classification)
• Intended for graduate students beginning careers in molecular biology, systems biology, bioengineering and genetics