Likelihood Estimation with Normal Mixture Models

We consider some of the problems associated with likelihood estimation in the context of a mixture of multivariate normal distributions. With mixture models, the likelihood equation usually has multiple roots, which raises the question of which root to choose. In the case of equal covariance matrices, the choice of root is straightforward in the sense that the maximum likelihood estimator exists and is consistent. However, an example is presented to demonstrate that adopting a homoscedastic normal model in the presence of some heteroscedasticity can considerably influence the likelihood estimates, in particular those of the mixing proportions, and hence the consequent clustering of the sample at hand.
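The effect described above can be explored with a small simulation. The following is a minimal sketch and not the example from the paper: it simplifies to a univariate mixture of two normal components, and the function name `em_two_normals`, the starting values, the sample sizes, and the simulated parameters are all assumptions chosen for illustration. It fits the same heteroscedastic sample twice by EM, once with a common variance and once with separate variances, so that the two estimates of the mixing proportion can be compared.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_normals(data, common_variance, n_iter=200):
    """EM for a two-component univariate normal mixture (illustrative only).

    Returns (pi1, means, variances, loglik_trace), where loglik_trace holds
    the log-likelihood evaluated at the start of each iteration.
    """
    n = len(data)
    ordered = sorted(data)
    # Assumed starting values: sample quartiles as means, overall variance for both.
    means = [ordered[n // 4], ordered[3 * n // 4]]
    grand_mean = sum(data) / n
    overall_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [overall_var, overall_var]
    pi1 = 0.5
    trace = []
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 1.
        post = []
        loglik = 0.0
        for x in data:
            a = pi1 * normal_pdf(x, means[0], variances[0])
            b = (1.0 - pi1) * normal_pdf(x, means[1], variances[1])
            post.append(a / (a + b))
            loglik += math.log(a + b)
        trace.append(loglik)
        # M-step: weighted means, then weighted sums of squares about them.
        w1 = sum(post)
        w2 = n - w1
        means = [
            sum(p * x for p, x in zip(post, data)) / w1,
            sum((1.0 - p) * x for p, x in zip(post, data)) / w2,
        ]
        s1 = sum(p * (x - means[0]) ** 2 for p, x in zip(post, data))
        s2 = sum((1.0 - p) * (x - means[1]) ** 2 for p, x in zip(post, data))
        if common_variance:
            # Homoscedastic model: one pooled variance shared by both components.
            variances = [(s1 + s2) / n] * 2
        else:
            # Heteroscedastic model: each component keeps its own variance.
            variances = [s1 / w1, s2 / w2]
        pi1 = w1 / n
    return pi1, means, variances, trace


# Simulated heteroscedastic sample (assumed values): equal true proportions,
# but the second component has three times the standard deviation of the first.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(300)]
data += [rng.gauss(4.0, 3.0) for _ in range(300)]

for label, common in (("common variance", True), ("separate variances", False)):
    pi1, means, variances, trace = em_two_normals(data, common_variance=common)
    print(label, "pi1 =", round(pi1, 3), "means =", [round(m, 2) for m in means])
```

The true mixing proportion in this simulation is 0.5, so the gap between each fitted `pi1` and 0.5 gives a rough sense of how much the equal-variance assumption shifts the estimate. Because EM only finds a root of the likelihood equation near its starting values, rerunning with different starts is one way to see the multiple-roots issue as well.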
