Provable Online CP/PARAFAC Decomposition of a Structured Tensor via Dictionary Learning

We consider the problem of factorizing a structured 3-way tensor into its constituent Canonical Polyadic (CP) factors. This decomposition, which can be viewed as a generalization of the singular value decomposition (SVD) to tensors, reveals how the tensor dimensions (features) interact with one another. However, since the factors are a priori unknown, the corresponding optimization problems are inherently non-convex. Existing guaranteed algorithms that handle this non-convexity incur an irreducible error (bias) and apply only to cases where all factors share the same structure. To address these limitations, we develop a provable algorithm for online structured tensor factorization, wherein one of the factors satisfies certain incoherence conditions and the others are sparse. Specifically, we show that, under relatively mild conditions on initialization, rank, and sparsity, our algorithm recovers the factors exactly (up to scaling and permutation) at a linear rate. Complementing our theoretical results, evaluations on synthetic and real-world data showcase superior performance compared to related techniques. Moreover, the algorithm's scalability and ability to learn on the fly make it suitable for real-world tasks.
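As an illustration of the model class considered here, the following is a minimal NumPy sketch of a structured CP tensor: a sum of R rank-one terms whose first factor has (approximately) incoherent unit-norm columns and whose other two factors are sparse. The names A, B, C and the specific dimensions, rank, and sparsity level are illustrative assumptions, not values or code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3, R = 50, 40, 30, 5   # tensor dimensions and CP rank (assumed for illustration)
sparsity = 0.1                  # fraction of nonzero entries in B and C (assumed)

# Incoherent factor: random Gaussian columns, normalized to unit length.
A = rng.standard_normal((n1, R))
A /= np.linalg.norm(A, axis=0, keepdims=True)

def sparse_factor(n, r):
    """Random factor with roughly a `sparsity` fraction of nonzero Gaussian entries."""
    mask = rng.random((n, r)) < sparsity
    return mask * rng.standard_normal((n, r))

B, C = sparse_factor(n2, R), sparse_factor(n3, R)

# Assemble the rank-R tensor T = sum_r a_r (outer) b_r (outer) c_r.
T = np.einsum('ir,jr,kr->ijk', A, B, C)
print(T.shape)  # (50, 40, 30)
```

A factorization algorithm of the kind described in the abstract would take such a tensor T (observed one slice or batch at a time in the online setting) and aim to recover A, B, and C up to scaling and permutation of the columns.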
