The Spiked Matrix Model With Generative Priors

We investigate the statistical and algorithmic properties of random neural-network generative priors in a simple inference problem: spiked-matrix estimation. We establish a rigorous expression for the performance of the Bayes-optimal estimator in the high-dimensional regime, and identify the statistical threshold for weak recovery of the spike. Next, we derive a message-passing algorithm that accounts for the latent structure of the spike, and show that its performance is asymptotically optimal for natural choices of the generative network architecture. The absence of an algorithmic gap in this case stands in stark contrast to known results for sparse spikes, another popular prior for modelling low-dimensional signals, for which no algorithm is known to achieve the optimal statistical threshold. Finally, we show that linearising our message-passing algorithm yields a simple spectral method that also achieves the optimal threshold for reconstruction. We conclude with an experiment on a real data set, showing that our bespoke spectral method outperforms vanilla PCA.
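As a point of reference for the setting above, the following minimal sketch generates a spiked Wigner observation and estimates the spike with vanilla PCA (the top eigenvector). It uses an unstructured random spike rather than a generative-network prior, and the normalisation (unit-norm spike, noise entries of standard deviation 1/sqrt(n)) is one common convention, chosen here for illustration; it is not taken from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, snr = 2000, 2.0  # snr > 1 puts us above the spectral (BBP) threshold

# Spike: a random unit-norm signal (a stand-in for a structured prior)
x = rng.standard_normal(n)
x /= np.linalg.norm(x)

# Symmetric Gaussian noise matrix with entries of std 1/sqrt(n)
G = rng.standard_normal((n, n))
W = (G + G.T) / np.sqrt(2 * n)

# Spiked Wigner observation: rank-one signal plus noise
Y = snr * np.outer(x, x) + W

# Vanilla PCA: the top eigenvector of Y is the estimate of the spike
eigvals, eigvecs = np.linalg.eigh(Y)  # eigenvalues in ascending order
x_hat = eigvecs[:, -1]

# Overlap with the truth (sign is unidentifiable, so take the absolute value)
overlap = abs(x_hat @ x)
print(f"overlap = {overlap:.2f}")
```

Above the threshold the overlap concentrates on a positive value; below it, the top eigenvector of Y is asymptotically uncorrelated with the spike. The paper's point is that a spectral method adapted to the generative prior can succeed where this vanilla version fails.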
