Assessing a Mixture Model for Clustering with the Integrated Completed Likelihood

We propose a method for assessing a mixture model in a cluster analysis setting, based on the integrated completed likelihood. For this purpose, the observed data are assigned to unknown clusters using a maximum a posteriori (MAP) operator. The integrated completed likelihood (ICL) is then approximated using the Bayesian information criterion (BIC). Numerical experiments with the resulting ICL criterion on simulated and real data show that it performs well both for choosing a mixture model and for choosing a relevant number of clusters. In particular, ICL appears to be more robust than BIC to violations of some of the mixture model assumptions, and it can select a number of clusters leading to a sensible partitioning of the data.
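The two steps described above (MAP assignment, then a BIC-style approximation) can be illustrated with a minimal sketch. The abstract does not state the formula, so the form used here is an assumption: the completed log-likelihood evaluated at the MAP labels, minus the usual BIC penalty of half the number of free parameters times log n, on the "larger is better" scale. The function name `bic_and_icl`, the toy data, and the fixed parameter values are hypothetical and chosen only for illustration; in practice the parameters would be maximum likelihood estimates.

```python
import numpy as np

def bic_and_icl(weighted_log_dens, n_free_params):
    """Return (BIC, ICL, MAP labels) for a fitted mixture.

    weighted_log_dens[i, k] = log(pi_k * f_k(x_i)) at the fitted parameters.
    Assumed forms (not taken from the abstract), larger is better:
      BIC = observed log-likelihood  - (n_free_params / 2) * log(n)
      ICL = completed log-likelihood - (n_free_params / 2) * log(n)
    """
    a = np.asarray(weighted_log_dens, dtype=float)
    n = a.shape[0]
    row_max = a.max(axis=1)
    # Observed log-likelihood: stable log-sum-exp over components, summed over points.
    loglik = np.sum(row_max + np.log(np.exp(a - row_max[:, None]).sum(axis=1)))
    # MAP operator: each point goes to the component with the largest posterior,
    # which is the component with the largest weighted density.
    z_hat = a.argmax(axis=1)
    # Completed log-likelihood at the MAP labels: sum_i log(pi_z f_z(x_i)).
    completed = np.sum(row_max)
    penalty = 0.5 * n_free_params * np.log(n)
    return loglik - penalty, completed - penalty, z_hat

# Toy example: two univariate Gaussian components with fixed, hypothetical parameters.
x = np.array([-2.1, -1.9, 2.0, 2.2, 0.1])
weights = np.array([0.5, 0.5])
means = np.array([-2.0, 2.0])
sds = np.array([1.0, 1.0])

log_norm = (-0.5 * np.log(2 * np.pi) - np.log(sds)
            - 0.5 * ((x[:, None] - means) / sds) ** 2)
weighted = np.log(weights) + log_norm

# 5 free parameters: 2 means, 2 variances, 1 mixing proportion.
bic, icl, z_hat = bic_and_icl(weighted, n_free_params=5)
print(bic, icl, z_hat)
```

Under these assumed forms ICL never exceeds BIC, because the completed log-likelihood keeps only the largest term of each point's mixture sum. The gap grows when components overlap, which is one way to read the abstract's remark that ICL favours a number of clusters giving a sensible partition.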
