Perturbative construction of mean-field equations in extensive-rank matrix factorization and denoising

Factorization of matrices where the rank of the two factors diverges linearly with their sizes has many applications in diverse areas such as unsupervised representation learning, dictionary learning or sparse coding. We consider a setting where the two factors are generated from known componentwise independent prior distributions, and the statistician observes a (possibly noisy) componentwise function of their matrix product. In the limit where the dimensions of the matrices tend to infinity while their ratios remain fixed, we expect to be able to derive closed-form expressions for the optimal mean squared error on the estimation of the two factors. However, this remains a very involved mathematical and algorithmic problem. A related, but simpler, problem is extensive-rank matrix denoising, where one aims to reconstruct a matrix with extensive but usually small rank from noisy measurements. In this paper, we approach both these problems using high-temperature expansions at fixed order parameters. This allows us to clarify how previous attempts at solving these problems failed to find an asymptotically exact solution. We provide a systematic way to derive the corrections to these existing approximations, taking into account the structure of correlations particular to the problem. Finally, we illustrate our approach in detail on the case of extensive-rank matrix denoising. We compare our results with known optimal rotationally-invariant estimators, and show how exact asymptotic calculations of the minimal error can be performed using extensive-rank matrix integrals.
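
As a rough illustration of the observation model described above, the following sketch samples one instance of the problem. Everything specific in it is an assumption made for the example and is not taken from the paper: standard Gaussian priors on both factors, an identity output channel with additive Gaussian noise, the 1/sqrt(n) scaling of the product, and the names `sample_factorization_problem`, `m`, `n`, `p` and `noise_var`.

```python
import numpy as np

def sample_factorization_problem(m, n, p, noise_var=0.1, seed=0):
    """Sample factors F (m x n), X (n x p) and a noisy observation of F X.

    Assumptions for illustration only: i.i.d. standard Gaussian priors,
    additive Gaussian noise, and a 1/sqrt(n) normalization of the product.
    """
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((m, n))   # first factor, componentwise independent
    X = rng.standard_normal((n, p))   # second factor, componentwise independent
    signal = F @ X / np.sqrt(n)       # matrix of rank at most n
    Y = signal + np.sqrt(noise_var) * rng.standard_normal((m, p))
    return F, X, signal, Y

# Extensive rank: n grows proportionally to m and p (here n/m = 0.5, p/m = 1.5).
F, X, signal, Y = sample_factorization_problem(m=200, n=100, p=300)

# Error of the trivial estimator that returns Y itself; it is close to noise_var.
mse_raw = np.mean((Y - signal) ** 2)
```

In this sketch, matrix factorization corresponds to estimating `F` and `X` from `Y`, whereas denoising only asks for an estimate of `signal`.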
