MEAN FIELD ASYMPTOTICS IN HIGH-DIMENSIONAL STATISTICS: FROM EXACT RESULTS TO EFFICIENT ALGORITHMS

Modern data analysis requires building complex statistical models with massive numbers of parameters. It is now commonplace to learn models with millions of parameters using iterative optimization algorithms. What are the typical properties of the estimated models? In some cases, the high-dimensional limit of a statistical estimator is analogous to the thermodynamic limit of a certain (disordered) statistical mechanics system. Building on mathematical ideas from the mean-field theory of disordered systems, exact asymptotics can be computed for high-dimensional statistical learning problems. This theory suggests new practical algorithms and new procedures for statistical inference. It also leads to intriguing conjectures about the fundamental computational limits of statistical estimation.
