All-or-nothing statistical and computational phase transitions in sparse spiked matrix estimation

We determine the statistical and computational limits for estimating a rank-one matrix (the spike) corrupted by an additive Gaussian noise matrix, in a sparse limit where the underlying hidden vector (which generates the rank-one matrix) has a number of non-zero components scaling sub-linearly with the total dimension, and the signal-to-noise ratio tends to infinity at an appropriate rate. We prove explicit low-dimensional variational formulas for the asymptotic mutual information between the spike and the observed noisy matrix, and we analyze the approximate message passing (AMP) algorithm in the sparse regime. For Bernoulli and Bernoulli-Rademacher distributed vectors, when the sparsity and signal strength satisfy an appropriate scaling relation, we find all-or-nothing phase transitions for both the asymptotic minimum and the algorithmic mean-square errors: they jump from their maximum possible value to zero at well-defined signal-to-noise thresholds whose asymptotic values we determine exactly. In this asymptotic regime the statistical-to-algorithmic gap diverges, indicating that sparse recovery is hard for approximate message passing.
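To make the setting concrete, the following is a minimal, self-contained sketch of AMP for the *dense* Rademacher spiked Wigner model (the standard baseline; the paper's sparse, sub-linear regime would require a rescaled sparse prior and diverging signal-to-noise ratio). The model is Y = sqrt(lambda/n) x x^T + W with a symmetric Gaussian W; the iteration uses the Bayes-optimal tanh denoiser with an Onsager correction, initialized spectrally. The dimension, SNR value, and iteration count below are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n, lam = 800, 4.0                              # dimension and SNR (illustrative values)
x = rng.choice([-1.0, 1.0], size=n)            # hidden Rademacher spike

# Symmetric Gaussian noise matrix (Wigner ensemble, off-diagonal variance 1)
G = rng.normal(size=(n, n))
W = (G + G.T) / np.sqrt(2.0)
Y = np.sqrt(lam / n) * np.outer(x, x) + W      # observed spiked matrix

A = Y / np.sqrt(n)                             # rescaled so the noise bulk is O(1)

# Spectral initialization: top eigenvector of A (informative above the BBP threshold lam > 1)
eigvals, eigvecs = np.linalg.eigh(A)
xt = eigvecs[:, -1] * np.sqrt(n)               # scale so entries are O(1)
xt_prev = np.zeros(n)
b = 0.0                                        # Onsager coefficient

# AMP: matched-filter step, Onsager correction, then componentwise tanh denoising
for _ in range(20):
    r = A @ xt - b * xt_prev
    xt_prev = xt
    xt = np.tanh(np.sqrt(lam) * r)
    b = np.sqrt(lam) * np.mean(1.0 - xt**2)    # average derivative of the denoiser

overlap = abs(xt @ x) / (np.linalg.norm(xt) * np.linalg.norm(x))
print(f"normalized overlap: {overlap:.3f}")
```

Above the spectral threshold (lam > 1) the normalized overlap concentrates near the state-evolution fixed point and is close to 1 for lam = 4; in the sparse sub-linear regime studied in the abstract, by contrast, AMP fails up to a diverging algorithmic threshold.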
