Homotopy Analysis for Tensor PCA

Developing efficient nonconvex algorithms with guarantees has been an important challenge in modern machine learning. Algorithms with good empirical performance, such as stochastic gradient descent, often lack theoretical guarantees. In this paper, we analyze the class of homotopy (continuation) methods for the global optimization of nonconvex functions. These methods start from an objective function that is efficient to optimize (e.g., convex) and progressively transform it into the required objective, passing the solutions along the homotopy path. For the challenging problem of tensor PCA, we prove global convergence of the homotopy method in the "high noise" regime. The signal-to-noise requirement of our algorithm is tight in the sense that it matches the recovery guarantee of the best degree-4 sum-of-squares algorithm. In addition, we prove a phase transition along the homotopy path for tensor PCA. This transition allows us to simplify the homotopy method to a local search algorithm, viz. tensor power iteration, with a specific initialization and a noise injection procedure, while retaining the theoretical guarantees.
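To make the local search step concrete, here is a minimal sketch of tensor power iteration on the spiked (rank-1 plus noise) tensor model. The dimension, the signal strength beta, the noise scaling, and the random initialization are illustrative assumptions for a small instance; the paper's guaranteed algorithm additionally uses a specific initialization derived from the homotopy path and a noise injection procedure, which this sketch omits.

```python
import numpy as np

def power_iteration(T, u0, n_iters=50):
    """Tensor power iteration: u <- T(:, u, u) / ||T(:, u, u)||."""
    u = u0 / np.linalg.norm(u0)
    for _ in range(n_iters):
        # Contract the 3-tensor T with u along its last two modes.
        u = np.einsum('ijk,j,k->i', T, u, u)
        u /= np.linalg.norm(u)
    return u

# Spiked tensor model (illustrative scaling): T = beta * v⊗v⊗v + G,
# where G is a symmetrized Gaussian noise tensor.
n, beta = 50, 5.0
rng = np.random.default_rng(0)
v = rng.normal(size=n)
v /= np.linalg.norm(v)

Z = rng.normal(size=(n, n, n))
# Symmetrize the noise by averaging over all six mode permutations.
G = (Z + Z.transpose(1, 0, 2) + Z.transpose(2, 1, 0)
       + Z.transpose(0, 2, 1) + Z.transpose(1, 2, 0) + Z.transpose(2, 0, 1)) / 6
T = beta * np.einsum('i,j,k->ijk', v, v, v) + G / np.sqrt(n)

# Random initialization (an assumption; the paper prescribes a specific one).
u = power_iteration(T, rng.normal(size=n))
print('correlation |<u, v>| =', abs(u @ v))
```

With a random initialization, power iteration only succeeds when the signal-to-noise ratio is sufficiently high; the specific initialization and the noise injection step in the paper are what extend the guarantee down to the tight degree-4 sum-of-squares threshold.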
