Maximum weighted likelihood via rival penalized EM for density mixture clustering with automatic model selection

The expectation-maximization (EM) algorithm (A. P. Dempster et al., 1977) has been used extensively in density mixture clustering, but it cannot perform model selection automatically. This paper therefore proposes learning the model parameters by maximizing a weighted likelihood. Under a specific weight design, we derive a rival penalized expectation-maximization (RPEM) algorithm, which makes the components in a density mixture compete with each other at each time step. Not only are the parameters of the winning component updated to adapt to an input, but the parameters of all its rivals are also penalized with a strength proportional to their posterior probabilities. Compared with the EM algorithm, the RPEM is able to fade out redundant densities from a density mixture during learning, and can therefore automatically select an appropriate number of densities in density mixture clustering. We experimentally demonstrate its outstanding performance on Gaussian mixtures and on a color image segmentation problem. Moreover, a simplified version of the RPEM generalizes our recently proposed RPCCL algorithm (Y.-M. Cheung, 2002) so that it also becomes applicable to elliptical clusters with any input proportion. Compared with the existing heuristic RPCL (L. Xu et al., 1993) and its variants, this generalized RPCCL (G-RPCCL) circumvents the difficult preselection of the so-called delearning rate. Additionally, a special setting of the G-RPCCL not only degenerates to the RPCL and its Type A variant, but also provides guidance for choosing an appropriate delearning rate for them. We further propose stochastic versions of the RPCL and its Type A variant, in which the difficult selection of the delearning rate is circumvented. Experiments show the promising results of this stochastic implementation.
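To make the reward/penalty mechanism concrete, the following is a minimal one-dimensional sketch of the RPEM idea under simplifying assumptions not taken from the paper: fixed equal variances, stochastic (per-sample) gradient updates on the weighted log-likelihood, and a weight design in which the winner receives weight (1 + eps) − eps·h while every rival receives the negative weight −eps·h_j. All function and parameter names here are illustrative, not the authors' notation.

```python
import math
import random

def rpem_fit(data, k=3, epochs=30, eta=0.05, eps=0.1, sigma=1.0, init_means=None):
    """Toy rival-penalized EM for a 1-D Gaussian mixture with fixed, equal
    variances (an illustrative simplification, not the paper's algorithm).
    At each input the components compete: the winner (largest posterior) is
    rewarded, while every rival is penalized in proportion to its posterior,
    so redundant components fade out (their mixing weight shrinks toward 0)."""
    means = list(init_means) if init_means else [random.choice(data) for _ in range(k)]
    betas = [0.0] * k  # mixing proportions alpha_j = softmax(betas)

    def softmax(b):
        m = max(b)
        e = [math.exp(v - m) for v in b]
        s = sum(e)
        return [v / s for v in e]

    for _ in range(epochs):
        for x in data:
            alphas = softmax(betas)
            # posterior h_j proportional to alpha_j * N(x; mu_j, sigma^2)
            dens = [a * math.exp(-0.5 * ((x - mu) / sigma) ** 2)
                    for a, mu in zip(alphas, means)]
            total = sum(dens) or 1e-300
            h = [d / total for d in dens]
            c = max(range(k), key=lambda j: h[j])  # winning component
            # weight design: winner gets (1 + eps) - eps*h_c > 0,
            # each rival gets -eps*h_j < 0 (the penalty); weights sum to 1
            g = [-eps * h[j] for j in range(k)]
            g[c] += 1.0 + eps
            # stochastic gradient ascent on the weighted log-likelihood
            for j in range(k):
                means[j] += eta * g[j] * (x - means[j]) / sigma ** 2
                betas[j] += eta * (g[j] - alphas[j])  # since sum_j g_j = 1
    return means, softmax(betas)
```

Run on data drawn from two well-separated Gaussians with three initial components, the component that rarely wins accumulates only penalties, so its mixing proportion decays toward zero, illustrating how the RPEM can start with an overestimated number of densities and fade out the redundant ones.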

[1] S. C. Ahalt et al., "Competitive learning algorithms for vector quantization," Neural Networks, 1990.

[2] H. Bozdogan et al., "Mixture-model cluster analysis using model selection criteria and a new informational measure of complexity," 1994.

[3] P. J. Green et al., "Density Estimation for Statistics and Data Analysis," 1987.

[4] H. Akaike, "Information theory and an extension of the maximum likelihood principle," 1973.

[5] G. Schwarz, "Estimating the dimension of a model," 1978.

[6] R. O. Duda and P. E. Hart, "Pattern Classification and Scene Analysis," Wiley-Interscience, 1974.

[7] A. Raftery et al., "Model-based Gaussian and non-Gaussian clustering," 1993.

[8] D. N. Geary, "Mixture Models: Inference and Applications to Clustering," 1989.

[9] Y.-M. Cheung, "Rival penalization controlled competitive learning for data clustering with unknown cluster number," Proc. 9th Int. Conf. on Neural Information Processing (ICONIP '02), 2002.

[10] S. U. Lee et al., "On the color image segmentation algorithm based on the thresholding and the fuzzy c-means techniques," Pattern Recognition, 1990.

[11] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," 1977.

[12] G. Piatetsky-Shapiro et al., "Advances in Knowledge Discovery and Data Mining," Lecture Notes in Computer Science, 2004.

[13] R. Redner et al., "Mixture densities, maximum likelihood, and the EM algorithm," 1984.

[14] M. A. Arbib et al., "Color image segmentation using competitive learning," IEEE Trans. Pattern Analysis and Machine Intelligence, 1994.

[15] L. Xu, A. Krzyzak, and E. Oja, "Rival penalized competitive learning for clustering analysis, RBF net, and curve detection," IEEE Trans. Neural Networks, 1993.

[16] L. Xu, "Rival penalized competitive learning, finite mixture, and multisets clustering," Proc. IEEE Int. Joint Conf. on Neural Networks (IJCNN), 1998.

[17] B. Fritzke, "The LBG-U method for vector quantization – an improvement over LBG inspired from neural networks," Neural Processing Letters, 1997.

[18] H. Akaike, "A new look at the statistical model identification," 1974.

[19] Y.-M. Cheung, "k*-Means: A new generalized k-means clustering algorithm," Pattern Recognition Letters, 2003.

[20] H. Bozdogan, "Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions," 1987.

[21] J. MacQueen, "Some methods for classification and analysis of multivariate observations," 1967.

[22] P. A. Devijver and J. Kittler, "Pattern Recognition: A Statistical Approach," 1982.

[23] L. Xu, "Bayesian Ying-Yang machine, clustering and number of clusters," Pattern Recognition Letters, 1997.

[24] R. M. Gray et al., "An algorithm for vector quantizer design," IEEE Trans. Communications, 1980.