Flexible Bayesian modelling of concomitant covariate effects in mixture models

Mixture models provide a useful tool to account for unobserved heterogeneity and are the basis of many model-based clustering methods. To gain additional flexibility, some model parameters can be expressed as functions of concomitant covariates. In particular, the prior probabilities of latent group membership can be linked to concomitant covariates through a multinomial logistic regression model, where each of these so-called component weights is associated with a linear predictor involving one or more of these variables. In this Thesis, this approach is extended by replacing the linear predictors with additive ones, in which the contributions of some or all of the concomitant covariates are represented by smooth functions. An estimation procedure within the Bayesian paradigm is proposed. In particular, a data augmentation scheme based on the difference random utility model representation is exploited, and the smoothness of the covariate effects is controlled through suitable choices of the prior distributions for the spline coefficients. This methodology is then extended to allow for flexible covariate effects also on the component densities. The performance of the proposed methodologies is investigated via simulation experiments and applications to real data.

The content of the Thesis is organized as follows. Chapter 1 provides a literature review of mixture models and of mixture models with covariate effects. After a brief introduction to Bayesian additive models with P-splines, the general specification of the proposed method is presented in Chapter 2, together with the associated Bayesian inference procedure. This approach is adapted to the specific cases of categorical and continuous manifest variables in Chapter 3 and Chapter 4, respectively. In Chapter 5, the proposed methodology is extended to include flexible covariate effects also in the component densities. Finally, conclusions and remarks on the Thesis are collected in Chapter 6.
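To fix ideas, a minimal sketch of the model class described above is given below. The notation (number of components G, weights pi_g, smooth functions s_{jg}, B-spline basis functions B_k, random-walk variances tau^2) is chosen here purely for illustration and is not taken verbatim from the Thesis.

% Minimal sketch (illustrative notation): G-component mixture whose weights
% depend on the concomitant covariates w_i through additive predictors.
f(y_i \mid w_i) = \sum_{g=1}^{G} \pi_g(w_i)\, f_g(y_i \mid \vartheta_g),
\qquad
\pi_g(w_i) = \frac{\exp\{\eta_g(w_i)\}}{\sum_{h=1}^{G} \exp\{\eta_h(w_i)\}} .

% Additive (rather than linear) predictor: each concomitant covariate may
% enter through a smooth function represented in a B-spline basis.
\eta_g(w_i) = \beta_{0g} + \sum_{j=1}^{J} s_{jg}(w_{ij}),
\qquad
s_{jg}(w_{ij}) = \sum_{k=1}^{K} b_{gjk}\, B_k(w_{ij}) .

% Smoothness of each effect is controlled by the prior on the spline
% coefficients, e.g. a second-order random walk (Bayesian P-splines):
b_{gjk} \mid b_{gj,k-1},\, b_{gj,k-2},\, \tau_{gj}^{2}
\sim \mathcal{N}\!\bigl( 2\, b_{gj,k-1} - b_{gj,k-2},\; \tau_{gj}^{2} \bigr),
\qquad k = 3, \ldots, K .

For identifiability of the multinomial logit, one of the predictors is conventionally taken as the baseline, e.g. by fixing \eta_G(w_i) \equiv 0.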
