Tensor Factorization via Matrix Factorization

Tensor factorization arises in many machine learning applications, such as knowledge base modeling and parameter estimation in latent variable models. However, numerical methods for tensor factorization have not reached the level of maturity of matrix factorization methods. In this paper, we propose a new method for CP tensor factorization that uses random projections to reduce the problem to simultaneous matrix diagonalization. Our method is conceptually simple and also applies to non-orthogonal and asymmetric tensors of arbitrary order. We prove that a small number of random projections essentially preserve the spectral information in the tensor, allowing us to remove the dependence on the eigengap that plagued earlier tensor-to-matrix reductions. Experimentally, our method outperforms existing tensor factorization methods on both simulated data and two real datasets.
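
The reduction described above can be illustrated with a minimal NumPy sketch. It is an illustration under simplifying assumptions, not the paper's implementation: it covers only the symmetric, orthogonal, noiseless third-order case, and the names `random_projections`, `U`, `weights`, and `W` are hypothetical. The sketch builds a tensor T = sum_r weights[r] * u_r (x) u_r (x) u_r, contracts its third mode with a few random unit vectors, and checks that every projected matrix is diagonalized by the same factor matrix U. Recovering U from the projected matrices would require a simultaneous diagonalization routine, which is omitted here.

```python
import numpy as np


def random_projections(T, num_projections, rng):
    """Contract the third mode of T with random unit vectors.

    Returns the stack of projected matrices M (one per projection)
    and the projection vectors W (one per row).
    """
    d = T.shape[2]
    W = rng.standard_normal((num_projections, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    # M[l, i, j] = sum_k T[i, j, k] * W[l, k]
    M = np.einsum("ijk,lk->lij", T, W)
    return M, W


rng = np.random.default_rng(0)
d, k = 5, 3

# Assumption: orthonormal factors (columns of U) and distinct positive weights.
U, _ = np.linalg.qr(rng.standard_normal((d, k)))
weights = np.array([3.0, 2.0, 1.0])

# Symmetric rank-k tensor: T[i, j, m] = sum_r weights[r] * U[i, r] * U[j, r] * U[m, r]
T = np.einsum("r,ir,jr,kr->ijk", weights, U, U, U)

M, W = random_projections(T, num_projections=4, rng=rng)

# Each M[l] equals U @ diag(weights * (U.T @ W[l])) @ U.T,
# so U.T @ M[l] @ U should be diagonal for every l.
D = np.einsum("ir,lij,js->lrs", U, M, U)
diagonals = np.einsum("lrr->lr", D)
off_diag = D - diagonals[:, :, None] * np.eye(k)
max_off_diagonal = np.abs(off_diag).max()

print("largest off-diagonal entry after conjugating by U:", max_off_diagonal)
```

Using several projections rather than one is the point of the design: a single projected matrix can have nearly equal eigenvalues for an unlucky projection vector, whereas a set of matrices sharing the same diagonalizer can be diagonalized jointly.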
