Gaussian Mixture Modeling by Exploiting the Mahalanobis Distance

In this paper, the expectation-maximization (EM) algorithm for Gaussian mixture modeling is improved by means of three statistical tests. The first test is a multivariate normality criterion based on the Mahalanobis distance of a sample measurement vector from the center of a Gaussian component; it drives the decision whether to split a component into two. The second test is a central-tendency criterion based on the observation that the multivariate kurtosis becomes large when the component to be split is a mixture of two or more underlying Gaussian sources with a common center. If the common-center hypothesis holds, the component is split into two new components whose centers are initialized to the center of the old component that was the candidate for splitting. Otherwise, the split is performed by a discriminant derived from the third test, which is based on marginal cumulative distribution functions. Experimental results compare the proposed variant against seven other EM variants on both artificially generated and real data sets, demonstrating an increased capability to recover the underlying model while maintaining a low execution time.
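
The two quantities behind the first two tests are standard and easy to illustrate. Below is a minimal Python/NumPy sketch (not the authors' implementation; function names and the toy data are illustrative assumptions) of the squared Mahalanobis distance of samples from a component center and of Mardia's multivariate kurtosis, whose expected value for a d-variate Gaussian is d(d+2); a markedly larger sample value is the signal, described in the abstract, that a component may mix two or more common-center Gaussian sources.

    import numpy as np

    def mahalanobis_sq(X, mu, Sigma):
        """Squared Mahalanobis distance of each row of X from center mu."""
        diff = X - mu
        Sigma_inv = np.linalg.inv(Sigma)
        # Quadratic form diff_i' Sigma^{-1} diff_i for every sample i.
        return np.einsum('ij,jk,ik->i', diff, Sigma_inv, diff)

    def mardia_kurtosis(X):
        """Mardia's multivariate kurtosis b_{2,d} of an (n x d) sample."""
        mu = X.mean(axis=0)
        Sigma = np.cov(X, rowvar=False, bias=True)  # MLE covariance
        d2 = mahalanobis_sq(X, mu, Sigma)
        return np.mean(d2 ** 2)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        d = 2
        # Single Gaussian: b_{2,d} should be near d(d+2) = 8.
        X1 = rng.multivariate_normal(np.zeros(d), np.eye(d), size=5000)
        # Hypothetical common-center mixture of two Gaussians with very
        # different scales: b_{2,d} inflates well above 8, which is the
        # kind of evidence the second test uses to trigger a split.
        X2 = np.vstack([
            rng.multivariate_normal(np.zeros(d), np.eye(d), size=2500),
            rng.multivariate_normal(np.zeros(d), 9 * np.eye(d), size=2500),
        ])
        print("single Gaussian:   b_2,d =", round(mardia_kurtosis(X1), 2))
        print("common-center mix: b_2,d =", round(mardia_kurtosis(X2), 2))

For the toy mixture above, the population kurtosis works out to about 13.1 versus the Gaussian reference value of 8, so the inflation is detectable even though the mixture's mean and covariance alone look unremarkable.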
