A Randomized Block Sampling Approach to Canonical Polyadic Decomposition of Large-Scale Tensors

In the analysis of large-scale datasets, one often assumes that the data have a simple structure. For tensors, a decomposition into a sum of rank-1 terms provides a compact and informative model. Finding this decomposition is intrinsically more difficult than finding its matrix counterpart. Moreover, for large-scale tensors, computational difficulties arise due to the curse of dimensionality. The randomized block sampling canonical polyadic decomposition method presented here combines increasingly popular ideas from randomization and stochastic optimization to tackle these computational problems. Instead of decomposing the full tensor at once, the method computes updates from small random block samples. By restricting the step size, the decomposition can be found up to near-optimal accuracy, while the computation time and the number of data accesses are reduced significantly. The scalability is illustrated by decomposing a synthetic 8 TB tensor and a real-life 12.5 GB tensor in a few minutes on a standard laptop.
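
As a rough illustration of the block sampling idea, the model can be written out for a third-order tensor. The notation is an assumption made here and is not taken from the text above: factor matrices $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ with columns $\mathbf{a}_r$, $\mathbf{b}_r$, $\mathbf{c}_r$, rank $R$, sampled index sets $\mathcal{I}$, $\mathcal{J}$, $\mathcal{K}$, update step $\mathbf{p}_k$, and step bound $\Delta_k$. The canonical polyadic model reads

\[
\mathcal{T} \approx \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r,
\qquad
t_{ijk} \approx \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}.
\]

Because each entry $t_{ijk}$ involves only row $i$ of $\mathbf{A}$, row $j$ of $\mathbf{B}$, and row $k$ of $\mathbf{C}$, a sampled block with $i \in \mathcal{I}$, $j \in \mathcal{J}$, $k \in \mathcal{K}$ depends only on the corresponding rows of the factor matrices. One possible reading of an update computed from a block sample with a restricted step size is therefore

\[
\min_{\mathbf{A}(\mathcal{I},:),\, \mathbf{B}(\mathcal{J},:),\, \mathbf{C}(\mathcal{K},:)}
\; \frac{1}{2} \sum_{i \in \mathcal{I}} \sum_{j \in \mathcal{J}} \sum_{k \in \mathcal{K}}
\Bigl( t_{ijk} - \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr} \Bigr)^{2},
\qquad \lVert \mathbf{p}_k \rVert \le \Delta_k .
\]

Under this sketch, only $|\mathcal{I}|\,|\mathcal{J}|\,|\mathcal{K}|$ tensor entries are read per update and only $(|\mathcal{I}| + |\mathcal{J}| + |\mathcal{K}|)R$ variables are changed. The specific choice and schedule of the bound $\Delta_k$ is not stated above and is left open here.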
