Choosing models in model-based clustering and discriminant analysis

Using an eigenvalue decomposition of the group variance matrices, Celeux and Govaert (1993) derived numerous and powerful models for Gaussian model-based clustering and discriminant analysis. Through Monte Carlo simulations, we compare the performance of many classical criteria for selecting these models: information criteria such as AIC, the Bayesian criterion BIC, classification criteria such as NEC, and cross-validation. In the clustering context, the information criteria and BIC outperform the classification criteria. In the discriminant analysis context, cross-validation shows good performance, but the information criteria and BIC give satisfactory results as well, with far less computing time.
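As a minimal illustration of the kind of criterion-based selection the paper evaluates (this is not the authors' code; it uses scikit-learn's `GaussianMixture`, which exposes AIC and BIC but not NEC, on hypothetical two-cluster data), each candidate number of components is scored and the criterion minimizer is chosen:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy data (assumed for illustration): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=-3.0, scale=1.0, size=(150, 2)),
    rng.normal(loc=3.0, scale=1.0, size=(150, 2)),
])

# Fit mixtures with 1..4 components and record each information criterion.
scores = {}
for k in range(1, 5):
    gm = GaussianMixture(n_components=k, covariance_type="full",
                         random_state=0).fit(X)
    scores[k] = (gm.aic(X), gm.bic(X))

# Model selection: pick the number of components minimizing each criterion.
best_aic = min(scores, key=lambda k: scores[k][0])
best_bic = min(scores, key=lambda k: scores[k][1])
print(best_aic, best_bic)
```

Because BIC's penalty grows with the sample size while AIC's does not, BIC tends to favor more parsimonious mixtures; this difference is one of the behaviors the paper's simulations quantify.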

[1] J. Wolfe, A Monte Carlo Study of the Sampling Distribution of the Likelihood Ratio for Mixtures of Multinormal Distributions, 1971.

[2] H. Akaike, A new look at the statistical model identification, 1974.

[3] D. Rubin, et al., Maximum likelihood from incomplete data via the EM algorithm (with discussion), 1977.

[4] G. Schwarz, Estimating the Dimension of a Model, 1978.

[5] G. J. McLachlan, et al., The classification and mixture maximum likelihood approaches to cluster analysis, 1982, Classification, Pattern Recognition and Reduction of Dimensionality.

[6] D. Rubin, et al., Estimation and Hypothesis Testing in Finite Mixture Models, 1985.

[7] R. Hathaway, Another interpretation of the EM algorithm for mixture distributions, 1986.

[8] H. Bozdogan, Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions, 1987.

[9] H. Bozdogan, On the information-based measure of covariance complexity and its application to the evaluation of multivariate linear models, 1990.

[10] G. McLachlan, Discriminant Analysis and Statistical Pattern Recognition, 1992.

[11] A. Raftery, et al., Model-based Gaussian and non-Gaussian clustering, 1993.

[12] Bernhard W. Flury, et al., Error rates in quadratic discrimination with constraints on the covariance matrices, 1994.

[13] Anil K. Jain, et al., Neural networks and pattern recognition, 1994.

[14] M. P. Windham, et al., Information-Based Validity Functionals for Mixture Analysis, 1994.

[15] Adrian E. Raftery, et al., Hypothesis Testing and Model Selection Via Posterior Simulation, 1995.

[16] Gérard Govaert, et al., Gaussian parsimonious clustering models, 1995, Pattern Recognit.

[17] G. Celeux, et al., Regularized Gaussian Discriminant Analysis through Eigenvalue Decomposition, 1996.

[18] G. Celeux, et al., An entropy criterion for assessing the number of clusters in a mixture model, 1996.

[19] Adrian E. Raftery, et al., Inference in model-based cluster analysis, 1997, Stat. Comput.

[20] G. J. McLachlan, et al., On a Resampling Approach to Choosing the Number of Components in Normal Mixture Models, 1997.

[21] P. Green, et al., On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion), 1997.

[22] Omid Omidvar, et al., Neural Networks and Pattern Recognition, 1997.

[23] Christophe Biernacki, Choix de modèles en classification (Model choice in classification), 1997.