Overfitting Bayesian Mixture Models with an Unknown Number of Components

This paper proposes solutions to three issues in the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov chain Monte Carlo (MCMC) sampling techniques, and the related label switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via the Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test-based estimation methods: priors are chosen to encourage the weights of extra groups to approach zero. MCMC sampling is made possible by prior parallel tempering, an extension of parallel tempering. Given a sufficiently large sample size, Zmix can accurately estimate the number of components, the posterior parameter estimates, and the allocation probabilities. The results reflect uncertainty in the final model, reporting the range of candidate models and their estimated probabilities from a single run. Label switching is resolved with a computationally lightweight method, Zswitch, developed for overfitted mixtures by combining the intuitiveness of allocation-based relabelling algorithms with the precision of label-invariant loss functions. Four simulation studies and three case studies from the literature illustrate Zmix and Zswitch. All methods are available in the R package Zmix, which can currently be applied to univariate Gaussian mixture models.
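The overfitting idea summarised above can be illustrated with a minimal sketch. It is not the Zmix implementation and does not use the Zmix package API. It assumes a symmetric Dirichlet prior on the mixture weights (the abstract does not name the prior family) and assumes that candidate models are summarised by counting the occupied components in each MCMC draw. The function names, the threshold of 0.01, and the concentration values 0.05 and 5.0 are all hypothetical choices made for illustration.

```python
import numpy as np


def mean_active_components(alpha, k=10, n_draws=2000, threshold=0.01, seed=0):
    """Average number of weights above `threshold` under a symmetric
    Dirichlet(alpha, ..., alpha) prior on k mixture weights.

    Assumption: a Dirichlet prior and a 0.01 cut-off are illustrative
    choices, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    weights = rng.dirichlet(np.full(k, alpha), size=n_draws)
    return float((weights > threshold).sum(axis=1).mean())


def occupied_component_probs(allocations):
    """Estimated probability of each number of occupied components.

    `allocations` has one row per MCMC draw and one column per observation;
    each entry is the component label assigned to that observation.
    Assumption: counting non-empty components per draw is one plausible way
    to summarise candidate models from a single run.
    """
    rows = np.asarray(allocations)
    counts = [len(np.unique(row)) for row in rows]
    values, freq = np.unique(counts, return_counts=True)
    return {int(v): float(f) / len(counts) for v, f in zip(values, freq)}


# A small concentration parameter pushes most prior weights toward zero,
# while a large one spreads mass across all k components.
sparse_mean = mean_active_components(alpha=0.05)
diffuse_mean = mean_active_components(alpha=5.0)

# Toy allocation draws (4 draws, 3 observations) standing in for sampler output.
toy_allocations = [[0, 0, 1], [0, 1, 2], [1, 1, 1], [0, 0, 1]]
toy_probs = occupied_component_probs(toy_allocations)
```

In this sketch, `sparse_mean` comes out well below `diffuse_mean`, which is the prior behaviour the overfitting approach relies on, and `toy_probs` gives the proportion of draws with one, two, or three occupied components.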
