When Does Non-Orthogonal Tensor Decomposition Have No Spurious Local Minima?

We study the optimization problem of decomposing $d$-dimensional fourth-order tensors with $k$ non-orthogonal components. We derive \textit{deterministic} conditions under which this problem has no spurious local minima. In particular, we show that if $\kappa = \frac{\lambda_{\max}}{\lambda_{\min}} < \frac{5}{4}$ and the incoherence coefficient is of order $O(\frac{1}{\sqrt{d}})$, then all local minima are globally optimal. Using standard techniques, these conditions can be transformed into conditions that hold with high probability in high dimensions when the components are generated randomly. Finally, we prove that the tensor power method with deflation and restarts can efficiently extract all the components within a tolerance level of $O(\kappa \sqrt{k\tau^3})$, which appears to be the noise floor of non-orthogonal tensor decomposition.
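To make the last claim concrete, the following is a minimal NumPy sketch of a tensor power method with deflation and random restarts for a symmetric fourth-order tensor $T = \sum_{i=1}^{k} \lambda_i a_i^{\otimes 4}$. It illustrates the general technique only and is not the exact procedure analyzed in the paper. The function names (`build_tensor`, `power_iteration`, `decompose`) and the parameters `n_restarts`, `n_iters`, and `seed` are assumptions introduced here for illustration.

```python
import numpy as np


def build_tensor(weights, components):
    """Return T = sum_i weights[i] * a_i^{(x)4}, where a_i are the columns of `components`."""
    A = components
    return np.einsum("i,ai,bi,ci,di->abcd", weights, A, A, A, A)


def power_iteration(T, u0, n_iters=50):
    """Run the update u <- T(I, u, u, u) / ||T(I, u, u, u)|| from the start point u0."""
    u = u0 / np.linalg.norm(u0)
    for _ in range(n_iters):
        v = np.einsum("abcd,b,c,d->a", T, u, u, u)
        norm = np.linalg.norm(v)
        if norm == 0.0:  # degenerate start; keep the current iterate
            break
        u = v / norm
    # The objective value T(u, u, u, u) doubles as the weight estimate.
    value = float(np.einsum("abcd,a,b,c,d->", T, u, u, u, u))
    return u, value


def decompose(T, k, n_restarts=10, n_iters=50, seed=0):
    """Extract k rank-one terms greedily: restart, keep the best run, then deflate."""
    rng = np.random.default_rng(seed)
    d = T.shape[0]
    residual = T.copy()
    weights, vectors = [], []
    for _ in range(k):
        best_u, best_val = None, -np.inf
        for _ in range(n_restarts):
            u, val = power_iteration(residual, rng.standard_normal(d), n_iters)
            if val > best_val:
                best_u, best_val = u, val
        weights.append(best_val)
        vectors.append(best_u)
        # Deflation: subtract the extracted rank-one term from the residual.
        residual = residual - best_val * np.einsum(
            "a,b,c,d->abcd", best_u, best_u, best_u, best_u
        )
    return np.array(weights), np.column_stack(vectors)
```

With orthonormal components, each extracted vector matches a true component up to sign. With non-orthogonal components, the recovered vectors are only approximate, which is the regime the tolerance bound above addresses. Because the tensor has even order, $a_i$ and $-a_i$ yield the same rank-one term, so recovery should be compared up to sign.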
