Statistical mechanics of low-rank tensor decomposition

Large, high-dimensional datasets collected across multiple modalities can often be organized as a higher-order tensor. Low-rank tensor decomposition then arises as a powerful and widely used tool to discover simple low-dimensional structures underlying such data. However, we currently lack a theoretical understanding of the algorithmic behavior of low-rank tensor decompositions. We derive Bayesian approximate message passing (AMP) algorithms for recovering arbitrarily shaped low-rank tensors buried within noise, and we employ dynamic mean-field theory to precisely characterize their performance. Our theory reveals the existence of phase transitions between easy, hard, and impossible inference regimes, and displays an excellent match with simulations. Moreover, it reveals several qualitative surprises compared to the behavior of symmetric, cubic tensor decomposition. Finally, we compare our AMP algorithm to the most commonly used algorithm, alternating least squares (ALS), and demonstrate that AMP significantly outperforms ALS in the presence of noise.
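As a concrete illustration of the planted low-rank-plus-noise setting described above, the sketch below generates a rank-1 three-way tensor buried in i.i.d. Gaussian noise and fits it with alternating least squares (ALS), the baseline algorithm mentioned in the abstract. This is a minimal sketch, not the paper's AMP algorithm; the dimensions, noise scaling, and iteration count are illustrative assumptions only.

```python
# Minimal sketch (assumed setup, not the paper's AMP algorithm): rank-1 ALS on a
# synthetic "spiked" 3-way tensor, i.e., a planted low-rank signal plus Gaussian noise.
import numpy as np

rng = np.random.default_rng(0)

# Planted rank-1 signal of shape (I, J, K), buried in i.i.d. unit-variance noise.
I, J, K = 40, 50, 60
snr = 3.0  # illustrative signal strength; chosen so the signal is clearly recoverable
u = rng.normal(size=I); u /= np.linalg.norm(u)
v = rng.normal(size=J); v /= np.linalg.norm(v)
w = rng.normal(size=K); w /= np.linalg.norm(w)
signal = np.einsum('i,j,k->ijk', u, v, w)          # unit-Frobenius-norm rank-1 tensor
T = snr * np.sqrt(I * J * K) * signal + rng.normal(size=(I, J, K))

# Rank-1 ALS: cyclically update each factor via its closed-form least-squares solution
# with the other two factors held fixed.
a = rng.normal(size=I)
b = rng.normal(size=J)
c = rng.normal(size=K)
for _ in range(100):
    a = np.einsum('ijk,j,k->i', T, b, c) / ((b @ b) * (c @ c))
    b = np.einsum('ijk,i,k->j', T, a, c) / ((a @ a) * (c @ c))
    c = np.einsum('ijk,i,j->k', T, a, b) / ((a @ a) * (b @ b))

# Overlap with the planted factors (up to sign) measures recovery quality.
for est, true, name in [(a, u, 'u'), (b, v, 'v'), (c, w, 'w')]:
    print(f"overlap with {name}: {abs(est @ true) / np.linalg.norm(est):.3f}")
```

Each ALS step is a closed-form least-squares update of one factor with the others fixed; the paper's contribution is to replace such updates with Bayesian AMP messages and to characterize, via dynamic mean-field theory, when either approach succeeds or fails as the noise level varies.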
