Model-Based Clustering, Classification, and Density Estimation Using mclust in R

Author: Luca Scrucca
Publisher: CRC Press
Total Pages: 269
Release: 2023-04-20
Genre: Mathematics
ISBN: 1000868346

Model-based clustering and classification methods provide a systematic statistical approach to clustering, classification, and density estimation via mixture modeling. The model-based framework allows the problems of choosing or developing an appropriate clustering or classification method to be understood within the context of statistical modeling. The mclust package for the statistical environment R is a widely adopted platform implementing these model-based strategies. The package includes both summary and visual functionality, complementing procedures for estimating and choosing models. Key features of the book: an introduction to the model-based approach and the mclust R package; a detailed description of mclust and the underlying modeling strategies; an extensive set of examples, color plots, and figures along with the R code for reproducing them; and a companion website, including the R code to reproduce the examples and figures presented in the book, errata, and other supplementary material. Model-Based Clustering, Classification, and Density Estimation Using mclust in R is accessible to quantitatively trained students and researchers with a basic understanding of statistical methods, including inference and computing. In addition to serving as a reference manual for mclust, the book will be particularly useful to those wishing to employ these model-based techniques in research or applications in statistics, data science, clinical research, social science, and many other disciplines.

Finite Mixture Models

Author: Geoffrey McLachlan
Publisher: John Wiley & Sons
Total Pages: 419
Release: 2004-03-22
Genre: Mathematics
ISBN: 047165406X

An up-to-date, comprehensive account of major issues in finite mixture modeling. This volume provides an up-to-date account of the theory and applications of modeling via finite mixture distributions. With an emphasis on the applications of mixture models in both mainstream analysis and other areas such as unsupervised pattern recognition, speech recognition, and medical imaging, the book describes the formulations of the finite mixture approach, details its methodology, discusses aspects of its implementation, and illustrates its application in many common statistical contexts. Major issues discussed in this book include identifiability problems, actual fitting of finite mixtures through use of the EM algorithm, properties of the maximum likelihood estimators so obtained, assessment of the number of components to be used in the mixture, and the applicability of asymptotic theory in providing a basis for the solutions to some of these problems. The author also considers how the EM algorithm can be scaled to handle the fitting of mixture models to very large databases, as in data mining applications. This comprehensive, practical guide:
* Provides more than 800 references (40% published since 1995)
* Includes an appendix listing available mixture software
* Links statistical literature with machine learning and pattern recognition literature
* Contains more than 100 helpful graphs, charts, and tables
Finite Mixture Models is an important resource for both applied and theoretical statisticians as well as for researchers in the many areas in which finite mixture models can be used to analyze data.

Model-Based Clustering and Classification for Data Science

Author: Charles Bouveyron
Publisher: Cambridge University Press
Total Pages: 447
Release: 2019-07-25
Genre: Mathematics
ISBN: 1108640591

Cluster analysis finds groups in data automatically. Most methods have been heuristic and leave open such central questions as: how many clusters are there? Which method should I use? How should I handle outliers? Classification assigns new observations to groups given previously classified observations, and also has open questions about parameter tuning, robustness and uncertainty assessment. This book frames cluster analysis and classification in terms of statistical models, thus yielding principled estimation, testing and prediction methods, and sound answers to the central questions. It builds the basic ideas in an accessible but rigorous way, with extensive data examples and R code; describes modern approaches to high-dimensional data and networks; and explains such recent advances as Bayesian regularization, non-Gaussian model-based clustering, cluster merging, variable selection, semi-supervised and robust classification, clustering of functional data, text and images, and co-clustering. Written for advanced undergraduates in data science, as well as researchers and practitioners, it assumes basic knowledge of multivariate calculus, linear algebra, probability and statistics.

Hands-On Machine Learning with R

Author: Brad Boehmke
Publisher: CRC Press
Total Pages: 373
Release: 2019-11-07
Genre: Business & Economics
ISBN: 1000730433

Hands-on Machine Learning with R provides a practical and applied approach to learning and developing intuition into today’s most popular machine learning methods. This book serves as a practitioner’s guide to the machine learning process and is meant to help the reader learn to apply the machine learning stack within R, which includes using various R packages such as glmnet, h2o, ranger, xgboost, keras, and others to effectively model and gain insight from their data. The book favors a hands-on approach, providing an intuitive understanding of machine learning concepts through concrete examples and just a little bit of theory. Throughout this book, the reader will be exposed to the entire machine learning process including feature engineering, resampling, hyperparameter tuning, model evaluation, and interpretation. The reader will be exposed to powerful algorithms such as regularized regression, random forests, gradient boosting machines, deep learning, generalized low rank models, and more! By favoring a hands-on approach and using real-world data, the reader will gain an intuitive understanding of the architectures and engines that drive these algorithms and packages, understand when and how to tune the various hyperparameters, and be able to interpret model results. By the end of this book, the reader should have a firm grasp of R’s machine learning stack and be able to implement a systematic approach for producing high-quality modeling results. Features:
· Offers a practical and applied introduction to the most popular machine learning methods.
· Topics covered include feature engineering, resampling, deep learning and more.
· Uses a hands-on approach and real-world data.

An Introduction to Clustering with R

Author: Paolo Giordani
Publisher: Springer Nature
Total Pages: 340
Release: 2020-08-27
Genre: Mathematics
ISBN: 9811305536

The purpose of this book is to thoroughly prepare the reader for applied research in clustering. Cluster analysis comprises a class of statistical techniques for classifying multivariate data into groups or clusters based on their similar features. Clustering is now widely used in several domains of research, such as social sciences, psychology, and marketing, highlighting its multidisciplinary nature. This book provides an accessible and comprehensive introduction to clustering and offers practical guidelines for applying clustering tools through carefully chosen real-life datasets and extensive data analyses. The procedures addressed in this book include traditional hard clustering methods and up-to-date developments in soft clustering. Attention is paid to practical examples and applications through the open source statistical software R. Commented R code and output for conducting, step by step, complete cluster analyses are available. The book is intended for researchers interested in applying clustering methods. Basic notions on theoretical issues and on R are provided so that professionals as well as novices with little or no background in the subject will benefit from the book.

Mixture Model-Based Classification

Author: Paul D. McNicholas
Publisher: CRC Press
Total Pages: 212
Release: 2016-10-04
Genre: Mathematics
ISBN: 1482225670

"This is a great overview of the field of model-based clustering and classification by one of its leading developers. McNicholas provides a resource that I am certain will be used by researchers in statistics and related disciplines for quite some time. The discussion of mixtures with heavy tails and asymmetric distributions will place this text as the authoritative, modern reference in the mixture modeling literature." (Douglas Steinley, University of Missouri) Mixture Model-Based Classification is the first monograph devoted to mixture model-based approaches to clustering and classification. This is a book for both established researchers and newcomers to the field. A history of mixture models as a tool for classification is provided and Gaussian mixtures are considered extensively, including mixtures of factor analyzers and other approaches for high-dimensional data. Non-Gaussian mixtures are considered, from mixtures with components that parameterize skewness and/or concentration, right up to mixtures of multiple scaled distributions. Several other important topics are considered, including mixture approaches for clustering and classification of longitudinal data as well as discussion about how to define a cluster. Paul D. McNicholas is the Canada Research Chair in Computational Statistics at McMaster University, where he is a Professor in the Department of Mathematics and Statistics. His research focuses on the use of mixture model-based approaches for classification, with particular attention to clustering applications, and he has published extensively within the field. He is an associate editor for several journals and has served as a guest editor for a number of special issues on mixture models.

Clustering and Classification

Author: Phipps Arabie
Publisher: World Scientific
Total Pages: 508
Release: 1996
Genre: Mathematics
ISBN: 9789810212872

At a moderately advanced level, this book seeks to cover the areas of clustering and related methods of data analysis where major advances are being made. Topics include: hierarchical clustering, variable selection and weighting, additive trees and other network models, relevance of neural network models to clustering, the role of computational complexity in cluster analysis, latent class approaches to cluster analysis, theory and method with applications of a hierarchical classes model in psychology and psychopathology, combinatorial data analysis, clusterwise aggregation of relations, review of the Japanese-language results on clustering, review of the Russian-language results on clustering and multidimensional scaling, practical advances, and significance tests.

Matrix Variate Distributions

Author: A K Gupta
Publisher: CRC Press
Total Pages: 382
Release: 2018-05-02
Genre: Mathematics
ISBN: 1351433008

Useful in physics, economics, psychology, and other fields, random matrices play an important role in the study of multivariate statistical methods. Until now, however, most of the material on random matrices could only be found scattered in various statistical journals. Matrix Variate Distributions gathers and systematically presents most of the recent developments in continuous matrix variate distribution theory and includes new results. After a review of the essential background material, the authors investigate the range of matrix variate distributions, including the matrix variate normal distribution, Wishart distribution, matrix variate t-distribution, matrix variate beta distribution, F-distribution, matrix variate Dirichlet distribution, and matrix quadratic forms. With its inclusion of new results, Matrix Variate Distributions promises to stimulate further research and help advance the field of multivariate statistical analysis.

Applied Compositional Data Analysis

Author: Peter Filzmoser
Publisher: Springer
Total Pages: 288
Release: 2018-11-03
Genre: Mathematics
ISBN: 3319964224

This book presents the statistical analysis of compositional data using the log-ratio approach. It includes a wide range of classical and robust statistical methods adapted for compositional data analysis, such as supervised and unsupervised methods like PCA, correlation analysis, classification and regression. In addition, it considers special data structures like high-dimensional compositions and compositional tables. The methodology introduced is also frequently compared to methods which ignore the specific nature of compositional data. It focuses on practical aspects of compositional data analysis rather than on detailed theoretical derivations; thus, issues like graphical visualization and preprocessing (treatment of missing values, zeros, outliers and similar artifacts) form an important part of the book. Since it is primarily intended for researchers and students from applied fields like geochemistry, chemometrics, biology and natural sciences, economics, and social sciences, all the proposed methods are accompanied by worked-out examples in R using the package robCompositions.

Market Segmentation Analysis

Author: Sara Dolnicar
Publisher: Springer
Total Pages: 332
Release: 2018-07-20
Genre: Business & Economics
ISBN: 9811088187

This book is published open access under a CC BY 4.0 license. It offers something for everyone working with market segmentation: practical guidance for users of market segmentation solutions; organisational guidance on implementation issues; guidance for market researchers in charge of collecting suitable data; and guidance for data analysts with respect to the technical and statistical aspects of market segmentation analysis. Even market segmentation experts will find something new, including an approach to exploring data structure and choosing a suitable number of market segments, and a vast array of useful visualisation techniques that make interpretation of market segments and selection of target segments easier. The book talks the reader through every single step, every single potential pitfall, and every single decision that needs to be made to ensure market segmentation analysis is conducted as well as possible. All calculations are accompanied not only by a detailed explanation, but also by R code that allows readers to replicate any aspect of what is being covered in the book using R, the open-source environment for statistical computing and graphics.