Isotropic PCA and Affine-Invariant Clustering

We present an extension of principal component analysis (PCA) and a new algorithm for clustering points in \Rn based on it. The key property of the algorithm is that it is affine-invariant. When the input is a sample from a mixture of two arbitrary Gaussians, the algorithm correctly classifies the sample assuming only that the two components are separable by a hyperplane, i.e., there exists a halfspace that contains most of one Gaussian and almost none of the other in probability mass. This is nearly the best possible, improving known results substantially. For k>2 components, the algorithm requires only that there be some (k-1)-dimensional subspace in which the ``overlap'' in every direction is small. Our main tools are isotropic transformation, spectral projection and a simple reweighting technique. We call this combination isotropic PCA.

[1]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[2]  David G. Stork,et al.  Pattern Classification , 1973 .

[3]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[4]  G. Stewart,et al.  Matrix Perturbation Theory , 1990 .

[5]  M. Rudelson Random Vectors in the Isotropic Position , 1996, math/9608208.

[6]  Sanjoy Dasgupta,et al.  Learning mixtures of Gaussians , 1999, 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039).

[7]  Sanjoy Dasgupta,et al.  A Two-Round Variant of EM for Gaussian Mixtures , 2000, UAI.

[8]  Sanjeev Arora,et al.  Learning mixtures of arbitrary gaussians , 2001, STOC '01.

[9]  Shigeo Abe DrEng Pattern Classification , 2001, Springer London.

[10]  Santosh S. Vempala,et al.  A spectral algorithm for learning mixtures of distributions , 2002, The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings..

[11]  Sanjeev Arora,et al.  LEARNING MIXTURES OF SEPARATED NONSPHERICAL GAUSSIANS , 2005, math/0503457.

[12]  Dimitris Achlioptas,et al.  On Spectral Learning of Mixtures of Distributions , 2005, COLT.

[13]  Jon M. Kleinberg,et al.  On learning mixtures of heavy-tailed distributions , 2005, 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05).

[14]  Pavel Pudil,et al.  Introduction to Statistical Pattern Recognition , 2006 .

[15]  Jon Feldman,et al.  PAC Learning Axis-Aligned Mixtures of Gaussians with No Separation Assumption , 2006, COLT.

[16]  Mark Rudelson,et al.  Sampling from large matrices: An approach through geometric functional analysis , 2005, JACM.

[17]  Santosh S. Vempala,et al.  The geometry of logconcave functions and sampling algorithms , 2007, Random Struct. Algorithms.

[18]  S. Vempala,et al.  The geometry of logconcave functions and sampling algorithms , 2007 .

[19]  Santosh S. Vempala,et al.  The Spectral Method for General Mixture Models , 2008, SIAM J. Comput..

[20]  Satish Rao,et al.  Beyond Gaussians: Spectral Methods for Learning Mixtures of Heavy-Tailed Distributions , 2008, COLT.

[21]  Satish Rao,et al.  Learning Mixtures of Product Distributions Using Correlations and Independence , 2008, COLT.