Accelerating Block Coordinate Descent for Nonnegative Tensor Factorization

This paper is concerned with improving the empirical convergence speed of block-coordinate descent algorithms for approximate nonnegative tensor factorization (NTF). We propose an extrapolation strategy applied between block updates, referred to as heuristic extrapolation with restarts (HER). HER significantly accelerates the empirical convergence of most existing block-coordinate algorithms for dense NTF, in particular in challenging computational scenarios, while requiring negligible additional computation.
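The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea of extrapolating between block updates and restarting when the error increases. It is not the authors' HER procedure. The names `her_bcd` and `rank1_problem`, the parameters `beta`, `beta_max`, `grow` and `shrink`, the projection of extrapolated blocks onto the nonnegative orthant, and the choice to revert the whole sweep on a restart are all assumptions made for illustration. A rank-1 model is used only because its block update has a simple closed form.

```python
def her_bcd(blocks, update, objective, n_iter=100,
            beta=0.5, beta_max=1.0, grow=1.05, shrink=1.5):
    """Block coordinate descent with extrapolation between block updates.

    blocks    : list of blocks, each a list of nonnegative floats
    update    : update(k, blocks) -> new value of block k given the others
    objective : objective(blocks) -> error to minimise
    All tuning parameters are hypothetical defaults, not values from the paper.
    """
    x = [list(b) for b in blocks]  # accepted, non-extrapolated iterates
    y = [list(b) for b in blocks]  # extrapolated iterates fed to the updates
    err = objective(x)
    history = [err]
    for _ in range(n_iter):
        x_old = [list(b) for b in x]
        for k in range(len(x)):
            new = update(k, y)
            # Extrapolate along the last change of block k, then project so
            # the point passed to later updates stays nonnegative.
            y[k] = [max(0.0, n + beta * (n - o)) for n, o in zip(new, x[k])]
            x[k] = new
        new_err = objective(x)
        if new_err > err:
            # Restart: discard the sweep and extrapolate less aggressively.
            x = x_old
            y = [list(b) for b in x_old]
            beta /= shrink
        else:
            # Accept the sweep and cautiously increase the extrapolation.
            err = new_err
            history.append(err)
            beta = min(beta_max, beta * grow)
    return x, history


def rank1_problem(T):
    """Closed-form nonnegative block updates for a rank-1 model of a 3-way array."""
    dims = (len(T), len(T[0]), len(T[0][0]))
    idx = [(i, j, k) for i in range(dims[0])
           for j in range(dims[1]) for k in range(dims[2])]

    def objective(f):
        # Squared Frobenius error of the rank-1 approximation.
        return sum((T[i][j][k] - f[0][i] * f[1][j] * f[2][k]) ** 2
                   for i, j, k in idx)

    def update(mode, f):
        p, q = [m for m in range(3) if m != mode]
        # Guard against division by zero if a block has been projected to zero.
        denom = max(sum(v * v for v in f[p]) * sum(v * v for v in f[q]), 1e-12)
        num = [0.0] * dims[mode]
        for t in idx:
            num[t[mode]] += T[t[0]][t[1]][t[2]] * f[p][t[p]] * f[q][t[q]]
        # Least-squares solution for this block, clipped at zero.
        return [max(0.0, v / denom) for v in num]

    return update, objective


# Small synthetic example: an exactly rank-1 nonnegative tensor.
a, b, c = [1.0, 2.0, 3.0], [1.0, 0.5, 2.0], [2.0, 1.0]
T = [[[ai * bj * ck for ck in c] for bj in b] for ai in a]
update, objective = rank1_problem(T)
factors, history = her_bcd([[1.0] * 3, [1.0] * 3, [1.0] * 2], update, objective)
print(history[0], history[-1])
```

Because a sweep is only accepted when the error does not increase, `history` is non-increasing by construction. The extra cost per sweep in this sketch is one objective evaluation and a few vector operations.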
