EM for mixtures

Maximum likelihood via the EM algorithm is widely used to estimate the parameters of hidden-structure models such as Gaussian mixture models. But the EM algorithm has well-documented drawbacks: its solution can depend strongly on its starting position, and it may fail as a result of degeneracies. We stress the practical dangers of these limitations and the care with which they should be handled. Our main conclusion is that no method addresses them satisfactorily in all situations. But improvements are proposed: first, penalizing the log-likelihood of Gaussian mixture models in a Bayesian regularization perspective and, second, choosing the best among several relevant initialization strategies. In this perspective, we also propose new recursive initialization strategies which prove helpful. They are compared with standard initialization procedures through numerical experiments, and their effects on model selection criteria are analyzed.
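
To make the multiple-initialization idea concrete, here is a minimal sketch (not the authors' code, and not their recursive strategies) using scikit-learn's GaussianMixture: EM is run from two standard initialization strategies with several restarts each, and the fit with the highest log-likelihood is kept. The `reg_covar` ridge on the covariance diagonals is only a crude guard against degeneracies, loosely in the spirit of the penalized log-likelihood discussed above; the toy data are invented for illustration.

```python
# Sketch: compare EM runs started from several initialization strategies
# and keep the highest-likelihood fit.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: two well-separated Gaussian clusters in 2-D.
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)),
               rng.normal(5.0, 1.0, (200, 2))])

best_fit, best_ll = None, -np.inf
for init in ("kmeans", "random"):           # two standard strategies
    gm = GaussianMixture(n_components=2, init_params=init,
                         n_init=5,           # several restarts per strategy
                         reg_covar=1e-6,     # guards against degenerate covariances
                         random_state=0).fit(X)
    ll = gm.score(X)                         # mean log-likelihood per point
    if ll > best_ll:
        best_fit, best_ll = gm, ll

print(f"best mean log-likelihood: {best_ll:.3f}")
```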
