Resolution-Based Complexity Control for Gaussian Mixture Models

In unsupervised learning, mixtures of Gaussians have become a popular tool for statistical modeling. For this class of generative models, we present a complexity control scheme that provides an effective means of avoiding the overfitting usually encountered with unconstrained Gaussians, and mixtures of them, in high dimensions. Given a prespecified level of resolution, implied by a noise model with fixed variance, the scheme automatically selects the dimensionalities of the local signal subspaces by maximum likelihood estimation. Combined with a resolution-based control scheme for adjusting the number of mixture components, this yields an incremental model refinement procedure within a common deterministic annealing framework, which enables an efficient exploration of the model space. The advantages of the resolution-based framework are illustrated by experimental results on synthetic and high-dimensional real-world data.
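The dimensionality selection can be illustrated for a single Gaussian component. The sketch below is not the procedure of the paper. It assumes the standard result for probabilistic PCA with an isotropic noise variance held fixed, under which the maximum likelihood solution retains only those principal directions whose sample variance exceeds the noise variance. The function name `signal_subspace_dim`, the synthetic data, and the two resolution levels are illustrative assumptions.

```python
import numpy as np


def signal_subspace_dim(X, noise_var):
    """Count principal directions whose sample variance exceeds noise_var.

    Assumption: isotropic noise with fixed variance noise_var, so the
    retained directions are those with covariance eigenvalue > noise_var.
    """
    Xc = X - X.mean(axis=0)                   # center the data
    cov = Xc.T @ Xc / X.shape[0]              # ML (biased) sample covariance
    eigvals = np.linalg.eigvalsh(cov)[::-1]   # eigenvalues, descending
    return int(np.sum(eigvals > noise_var)), eigvals


# Synthetic data: a 3-dimensional signal subspace embedded in 10 dimensions.
rng = np.random.default_rng(0)
n, d, q_true = 2000, 10, 3
basis = np.linalg.qr(rng.normal(size=(d, q_true)))[0]   # orthonormal d x q
latent = rng.normal(size=(n, q_true)) * np.array([3.0, 2.0, 1.5])
noise = rng.normal(scale=0.5, size=(n, d))              # true noise variance 0.25
X = latent @ basis.T + noise

# Fine resolution: a small fixed noise variance keeps more directions.
q_fine, eigvals = signal_subspace_dim(X, noise_var=0.5)
# Coarse resolution: a larger fixed noise variance keeps fewer directions.
q_coarse, _ = signal_subspace_dim(X, noise_var=3.0)

print("leading eigenvalues:", np.round(eigvals[:4], 2))
print("dimension at fine resolution:  ", q_fine)
print("dimension at coarse resolution:", q_coarse)
```

Raising the fixed noise variance corresponds to a coarser resolution and a lower selected dimensionality, which is the sense in which the resolution level controls model complexity in this sketch.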
