Estimation of low-rank tensors via convex optimization

In this paper, we propose three approaches for estimating the Tucker decomposition of multi-way arrays (tensors) from partial observations. All three are formulated as convex minimization problems, so every local minimum is also a global minimum and the optimal objective value is unique. The proposed approaches estimate the number of factors (rank) automatically through the optimization, so the rank does not need to be specified beforehand. The key technique we employ is trace norm regularization, a popular approach for the estimation of low-rank matrices. In addition, we propose a simple heuristic to improve the interpretability of the obtained factorization. The advantages and disadvantages of the three proposed approaches are demonstrated through numerical experiments on both synthetic and real-world datasets. We show that the proposed convex-optimization-based approaches are more accurate in predictive performance, faster, and more reliable in recovering a known multilinear structure than conventional approaches.
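To make the role of the trace norm concrete, the following is a minimal NumPy sketch, not the formulation used in the paper. It assumes one common way of extending the trace norm to tensors, namely summing the trace norms of the mode-k unfoldings, and it shows singular value soft-thresholding, the standard proximal step for the matrix trace norm, which sets small singular values to zero and thereby selects a rank without it being fixed in advance. The function names (`unfold`, `trace_norm`, `overlapped_trace_norm`, `singular_value_threshold`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-k unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def trace_norm(matrix):
    """Trace (nuclear) norm: the sum of the singular values."""
    return np.linalg.svd(matrix, compute_uv=False).sum()


def overlapped_trace_norm(tensor):
    """Sum of the trace norms of all mode-k unfoldings.

    Assumption for illustration: this is one possible tensor extension of
    the trace norm, not necessarily the one used in the paper.
    """
    return sum(trace_norm(unfold(tensor, k)) for k in range(tensor.ndim))


def singular_value_threshold(matrix, tau):
    """Proximal operator of tau * trace norm: soft-threshold singular values."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # singular values below tau become 0
    return (u * s_shrunk) @ vt


# Soft-thresholding a diagonal matrix shrinks each singular value by tau
# and zeroes the ones that fall below it, which lowers the rank.
example = np.diag([3.0, 1.0, 0.5])
shrunk = singular_value_threshold(example, tau=1.0)
rank_after = np.linalg.matrix_rank(shrunk)
```

In this sketch, the threshold `tau` plays the role of the regularization strength: larger values remove more singular values and yield a lower-rank estimate.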
