Polynomial Learning of Distribution Families

The question of polynomial learnability of probability distributions, particularly Gaussian mixture distributions, has recently received significant attention in theoretical computer science and machine learning. Despite major progress, however, the general question of polynomial learnability of Gaussian mixture distributions has remained open. The current work resolves this question for Gaussian mixtures in high dimension with an arbitrary fixed number of components. Specifically, we show that the parameters of a Gaussian mixture distribution with a fixed number of components can be learned from a sample whose size is polynomial in the dimension and all other parameters. The result on learning Gaussian mixtures relies on an analysis of distributions belonging to what we call “polynomial families” in low dimension. These families are characterized by their moments being polynomial in the parameters, and they include almost all common probability distributions as well as their mixtures and products. Using tools from real algebraic geometry, we show that the parameters of any distribution belonging to such a family can be learned in polynomial time using a polynomial number of sample points. The result on learning polynomial families is quite general and is of independent interest. To estimate the parameters of a Gaussian mixture distribution in high dimension, we provide a deterministic algorithm for dimensionality reduction. This allows us to reduce learning a high-dimensional mixture to a polynomial number of parameter estimations in low dimension. Combining this reduction with the results on polynomial families yields our result on learning arbitrary Gaussian mixtures in high dimension.
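As a small illustration of the defining property of a polynomial family, the sketch below computes the raw moments of a one-dimensional Gaussian mixture as polynomials in its weights, means, and variances, and compares them with empirical moments of a sample. It is not the learning algorithm of this work; the function names, the example parameters, and the use of the standard Gaussian moment recurrence are assumptions made purely for illustration.

```python
import math
import random


def gaussian_raw_moment(k, mu, sigma2):
    """k-th raw moment E[X^k] of N(mu, sigma2).

    Uses the standard recurrence m_k = mu * m_{k-1} + (k - 1) * sigma2 * m_{k-2},
    so every moment is a polynomial in mu and sigma2.
    """
    m_prev, m_curr = 1.0, mu  # m_0 and m_1
    if k == 0:
        return m_prev
    for j in range(2, k + 1):
        m_prev, m_curr = m_curr, mu * m_curr + (j - 1) * sigma2 * m_prev
    return m_curr


def mixture_raw_moment(k, weights, means, variances):
    """k-th raw moment of a 1-D Gaussian mixture: a polynomial in all parameters."""
    return sum(
        w * gaussian_raw_moment(k, mu, s2)
        for w, mu, s2 in zip(weights, means, variances)
    )


def sample_mixture(n, weights, means, variances, seed=0):
    """Draw n points from the mixture (hypothetical helper for the comparison)."""
    rng = random.Random(seed)
    components = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(means[c], math.sqrt(variances[c])) for c in components]


def empirical_raw_moment(k, xs):
    """Sample average of x^k."""
    return sum(x ** k for x in xs) / len(xs)


if __name__ == "__main__":
    # Example parameters chosen arbitrarily for illustration.
    weights, means, variances = [0.3, 0.7], [-1.0, 2.0], [0.5, 1.0]
    xs = sample_mixture(20000, weights, means, variances)
    for k in range(1, 4):
        exact = mixture_raw_moment(k, weights, means, variances)
        approx = empirical_raw_moment(k, xs)
        print(f"moment {k}: polynomial value {exact:.3f}, empirical {approx:.3f}")
```

Because the moments are explicit polynomials in the parameters, matching them to empirical moments turns parameter estimation into a system of polynomial equations, which is the setting where tools from real algebraic geometry apply.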

[4]  Mikhail Belkin, et al.  Polynomial Learning of Distribution Families, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, 2010.
