Convex Approximation to the Integral Mixture Models Using Step Functions

Parameter estimation for mixture models has long been known to yield only locally optimal solutions. In this paper, we propose a functional estimation of mixture models using step functions. We show that the proposed functional inference yields a convex formulation, and consequently that globally optimal inference of mixture models becomes feasible. The proposed approach further unifies previously isolated exemplar-based clustering techniques at a higher level of generality: for example, it provides a theoretical justification for the heuristics of clustering by affinity propagation (Frey & Dueck, 2007), and it reproduces Lashkari & Golland (2007)'s convex formulation as a special case of the step-function construction. Empirical studies also verify the theoretical justifications.
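
As a concrete illustration of the special case mentioned above, the sketch below implements exemplar-based convex clustering in the spirit of Lashkari & Golland (2007): every data point serves as a candidate exemplar with a fixed component density, and only the mixing weights are optimized over the simplex. Because the log-likelihood is concave in the weights, the EM-style update reaches the global optimum. The Gaussian kernel, the inverse-width parameter `beta`, and the pruning threshold are illustrative assumptions, not details taken from the present paper.

```python
import numpy as np

def convex_exemplar_clustering(X, beta=1.0, n_iter=200, tol=1e-8):
    """Minimal sketch of exemplar-based convex clustering: fixed components
    centered at the data points, mixing weights q optimized by EM updates.
    The objective is concave in q, so convergence is to a global optimum."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # (n, n) squared distances
    f = np.exp(-beta * d2)                               # f[i, j] ∝ p(x_i | exemplar j)
    q = np.full(n, 1.0 / n)                              # uniform weights on the simplex
    prev = -np.inf
    for _ in range(n_iter):
        p = f * q                                        # q_j * f(x_i | j), broadcast over columns
        s = p.sum(axis=1, keepdims=True)                 # mixture likelihood of each point
        r = p / s                                        # responsibilities r[i, j]
        q = r.mean(axis=0)                               # EM update of the mixing weights
        ll = float(np.log(s).mean())                     # average log-likelihood, concave in q
        if ll - prev < tol:
            break
        prev = ll
    exemplars = np.flatnonzero(q > 1e-3 / n)             # components that retained mass
    return q, exemplars
```

For instance, calling `q, exemplars = convex_exemplar_clustering(X, beta=0.5)` on an (n, d) array `X` returns the optimized weight vector together with the indices of the data points that retained nonzero mass; those surviving points act as the cluster exemplars.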

[1] J. Schafer, et al. Difficulties in Drawing Inferences With Finite-Mixture Models, 2004.

[2] Polina Golland, et al. Convex Clustering with Exemplar-Based Models, 2007, NIPS.

[3] Delbert Dueck, et al. Clustering by Passing Messages Between Data Points, 2007, Science.

[4] Adam Tauman Kalai, et al. Efficiently Learning Mixtures of Two Gaussians, 2010, STOC '10.

[5] Vladimir Vapnik, et al. Statistical Learning Theory, 1998.

[6] Brendan J. Frey, et al. Mixture Modeling by Affinity Propagation, 2005, NIPS.

[7] R. Baierlein. Probability Theory: The Logic of Science, 2004.

[8] Sanjoy Dasgupta, et al. Learning Mixtures of Gaussians, 1999, 40th Annual Symposium on Foundations of Computer Science.

[9] Radford M. Neal, et al. A Split-Merge Markov Chain Monte Carlo Procedure for the Dirichlet Process Mixture Model, 2004.

[10] D. S. Young. An Overview of Mixture Models, 2008, arXiv:0808.0383.

[11] Moritz Hardt, et al. Tight Bounds for Learning a Mixture of Two Gaussians, 2014, STOC.

[12] Ankur Moitra, et al. Settling the Polynomial Learnability of Mixtures of Gaussians, 2010, IEEE 51st Annual Symposium on Foundations of Computer Science.

[13] P. Green, et al. On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion), 1997.

[14] Anthony K. H. Tung, et al. Estimating Local Optimums in EM Algorithm over Gaussian Mixture Model, 2008, ICML '08.

[15] J. Sethuraman. A Constructive Definition of Dirichlet Priors, 1991.

[16] Sebastian Nowozin, et al. A Decoupled Approach to Exemplar-Based Unsupervised Learning, 2008, ICML '08.

[17] Maria Gheorghe, et al. Non-parametric Bayesian Networks for Parameter Estimation in Reservoir Simulation: A Graphical Take on the Ensemble Kalman Filter (Part I), 2013, Computational Geosciences.

[18] M. Stephens. Dealing with Label Switching in Mixture Models, 2000.

[19] G. McLachlan, et al. The EM Algorithm and Extensions, 1996.

[20] Jon Feldman, et al. PAC Learning Axis-Aligned Mixtures of Gaussians with No Separation Assumption, 2006, COLT.

[21] Michael I. Jordan, et al. Revisiting k-means: New Algorithms via Bayesian Nonparametrics, 2011, ICML.

[22] Geoffrey J. McLachlan, et al. Finite Mixture Models, 2019, Annual Review of Statistics and Its Application.

[23] D. Rubin, et al. The ECME Algorithm: A Simple Extension of EM and ECM with Faster Monotone Convergence, 1994.

[24] Sanjoy Dasgupta, et al. A Two-Round Variant of EM for Gaussian Mixtures, 2000, UAI.

[25] Carl E. Rasmussen, et al. The Infinite Gaussian Mixture Model, 1999, NIPS.

[26] S. Portnoy. Asymptotic Behavior of Likelihood Methods for Exponential Families when the Number of Parameters Tends to Infinity, 1988.

[27] Moritz Hardt, et al. Sharp Bounds for Learning a Mixture of Two Gaussians, 2014, arXiv.

[28] M. Stephens. Bayesian Analysis of Mixture Models with an Unknown Number of Components: An Alternative to Reversible Jump Methods, 2000.