The BYY annealing learning algorithm for Gaussian mixture with automated model selection

Bayesian Ying-Yang (BYY) learning provides a mechanism for parameter learning with automated model selection on Gaussian mixtures: maximizing a harmony function on a backward architecture of the BYY system. However, because the harmony function has a large number of local maxima, local search algorithms such as the hard-cut EM algorithm do not work well. To overcome this difficulty, we propose a simulated annealing learning algorithm that searches for the global maximum of the harmony function and is expressed as a kind of deterministic annealing EM procedure. Simulation experiments demonstrate that this BYY annealing learning algorithm can efficiently and automatically determine the number of clusters, or Gaussians, during the learning process. Moreover, the algorithm is successfully applied to two real-life data sets, covering Iris data classification and unsupervised color image segmentation.
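The abstract does not give the update equations, so the following is only a minimal sketch of the general idea behind a deterministic annealing EM procedure on a Gaussian mixture, not the paper's exact learning rule. It assumes diagonal covariances and a temperature parameter `lam` that tempers the posteriors as `p(j|x) ∝ [alpha_j q(x|theta_j)]^(1/lam)`: `lam = 1` gives the ordinary EM posteriors, and `lam -> 0` approaches hard-cut assignments. The function names, the linear annealing schedule, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def tempered_posteriors(X, alpha, mu, var, lam):
    """Posteriors p(j|x) proportional to [alpha_j * N(x | mu_j, diag(var_j))]^(1/lam)."""
    diff = X[:, None, :] - mu[None, :, :]                      # shape (n, k, d)
    log_pdf = -0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var), axis=2)
    log_joint = np.log(np.maximum(alpha, 1e-12)) + log_pdf     # shape (n, k)
    scaled = log_joint / lam
    scaled -= scaled.max(axis=1, keepdims=True)                # numerical stability
    p = np.exp(scaled)
    return p / p.sum(axis=1, keepdims=True)

def annealed_em(X, k, lams=None, iters_per_lam=5, seed=0):
    """EM-style updates with tempered posteriors; lam is lowered step by step.

    Assumption: a linear schedule from 1.0 down to 0.05. The schedule used in
    the paper is not stated in the abstract.
    """
    if lams is None:
        lams = np.linspace(1.0, 0.05, 40)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, size=k, replace=False)].copy()        # means start at data points
    var = np.tile(X.var(axis=0), (k, 1))                       # per-dimension variances
    alpha = np.full(k, 1.0 / k)                                # mixing proportions
    for lam in lams:
        for _ in range(iters_per_lam):
            p = tempered_posteriors(X, alpha, mu, var, lam)    # E-step at temperature lam
            nk = p.sum(axis=0) + 1e-12                         # soft counts per component
            alpha = nk / n                                     # M-step: proportions
            mu = (p.T @ X) / nk[:, None]                       # M-step: means
            diff2 = (X[:, None, :] - mu[None, :, :]) ** 2
            var = np.einsum("nk,nkd->kd", p, diff2) / nk[:, None] + 1e-6
    return alpha, mu, var

# Synthetic two-cluster data (illustrative only).
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=-5.0, scale=1.0, size=(200, 2)),
    rng.normal(loc=5.0, scale=1.0, size=(200, 2)),
])
alpha, mu, var = annealed_em(X, k=2)
print("mixing proportions:", alpha)
print("means:", mu)
```

Starting at `lam = 1` keeps the early assignments soft, which is the usual motivation for annealing when the objective has many local maxima; lowering `lam` then sharpens the assignments gradually instead of committing to hard cuts from the first iteration.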
