Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consist of m points in n dimensions, with n, m → ∞ while the ratio α = m/n remains fixed. Using exact but non-rigorous methods from statistical physics, we determine the critical values of α and of the distance between the clusters at which it becomes information-theoretically possible to reconstruct cluster membership better than chance. We also determine the accuracy achievable by the Bayes-optimal estimator. In particular, we find that when the number of clusters is sufficiently large, r > 4 + 2√α, there is a gap between the information-theoretic threshold and the threshold at which known polynomial-time algorithms succeed.
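The kind of transition discussed above can be illustrated numerically in the simplest case of two balanced clusters: for points x_i = ±μ v + z_i with Gaussian noise, the population covariance carries a rank-one spike of strength μ², and the classical BBP result predicts that the top eigenvalue of the sample covariance separates from the Marchenko–Pastur bulk exactly when μ² > 1/√α. The sketch below (illustrative parameter choices, not the paper's setup) checks this above threshold:

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 1000, 2000          # dimension and number of points
alpha = m / n              # aspect ratio, held fixed as n, m -> infinity
gamma = 1.0 / alpha        # Marchenko-Pastur ratio n/m

mu = np.sqrt(2.0)          # cluster separation, chosen above the threshold 1/alpha**(1/4)
v = rng.standard_normal(n)
v /= np.linalg.norm(v)     # hidden cluster direction, unit norm
s = rng.choice([-1.0, 1.0], size=m)   # hidden +/- cluster labels

# data matrix: each point sits at +/- mu * v plus isotropic Gaussian noise
X = np.outer(s, mu * v) + rng.standard_normal((m, n))

# sample covariance and its spectrum (eigh returns eigenvalues in ascending order)
S = X.T @ X / m
eigvals, eigvecs = np.linalg.eigh(S)
top_val = eigvals[-1]
top_vec = eigvecs[:, -1]

bulk_edge = (1.0 + np.sqrt(gamma)) ** 2   # Marchenko-Pastur upper edge
overlap = np.dot(top_vec, v) ** 2         # squared correlation with the hidden direction

print(f"top eigenvalue {top_val:.3f} vs bulk edge {bulk_edge:.3f}")
print(f"squared overlap with hidden direction: {overlap:.3f}")
```

With these parameters the spike strength μ² = 2 exceeds √γ ≈ 0.71, so the top eigenvalue lands near (1 + μ²)(1 + γ/μ²) = 3.75, well outside the bulk edge ≈ 2.91, and the leading eigenvector retains a finite overlap with the hidden direction. Below the threshold both effects disappear, which is the spectral counterpart of the clustering transition.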
