On Tensors, Sparsity, and Nonnegative Factorizations

Tensors have found application in a variety of fields, ranging from chemometrics to signal processing and beyond. In this paper, we consider the problem of multilinear modeling of sparse count data. Our goal is to develop a descriptive tensor factorization model of such data, along with appropriate algorithms and theory. To do so, we propose that the random variation is best described via a Poisson distribution, which accounts for the zeros observed in the data better than the typical assumption of a Gaussian distribution. Under a Poisson assumption, we fit a model to the observed data using the negative log-likelihood score. We present a new algorithm for Poisson tensor factorization, called CANDECOMP–PARAFAC alternating Poisson regression (CP-APR), that is based on a majorization-minimization approach. CP-APR can be shown to generalize the Lee–Seung multiplicative updates. We show how to prevent the algorithm from converging to non-KKT points and prove convergence of CP-APR under mil...
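
To make the Poisson fitting criterion concrete, the following is a minimal sketch of the two-way (matrix) special case only. It is not CP-APR itself. It evaluates the Poisson negative log-likelihood, sum over entries of m_ij - x_ij log m_ij with the constant log(x_ij!) terms dropped, and applies the classical Lee–Seung multiplicative updates for that objective. The function names, the toy count matrix X, and the rank R = 2 are assumptions made for illustration, and the sketch includes no safeguard against convergence to non-KKT points.

```python
import math
import random


def model(W, H, i, j):
    """Model mean m_ij = sum_r W[i][r] * H[r][j]."""
    return sum(W[i][r] * H[r][j] for r in range(len(H)))


def poisson_nll(X, W, H):
    """Poisson negative log-likelihood, dropping the constant log(x!) terms."""
    total = 0.0
    for i, row in enumerate(X):
        for j, x in enumerate(row):
            m = model(W, H, i, j)
            total += m
            if x > 0:  # a zero count contributes only m_ij
                total -= x * math.log(m)
    return total


def ratios(X, W, H):
    """Elementwise x_ij / m_ij, taken as 0 wherever x_ij = 0."""
    return [[x / model(W, H, i, j) if x > 0 else 0.0
             for j, x in enumerate(row)] for i, row in enumerate(X)]


def multiplicative_sweep(X, W, H):
    """One Lee-Seung sweep for the Poisson (KL) objective: update W, then H."""
    I, J, R = len(X), len(X[0]), len(H)
    Q = ratios(X, W, H)
    W = [[W[i][r] * sum(Q[i][j] * H[r][j] for j in range(J)) / sum(H[r])
          for r in range(R)] for i in range(I)]
    Q = ratios(X, W, H)  # recompute with the new W before updating H
    H = [[H[r][j] * sum(W[i][r] * Q[i][j] for i in range(I))
          / sum(W[i][r] for i in range(I))
          for j in range(J)] for r in range(R)]
    return W, H


# Toy sparse count matrix (illustrative data, not from the paper).
random.seed(0)
X = [[3, 0, 1], [0, 2, 0], [1, 0, 4], [0, 0, 1]]
R = 2
W = [[random.uniform(0.5, 1.5) for _ in range(R)] for _ in X]
H = [[random.uniform(0.5, 1.5) for _ in X[0]] for _ in range(R)]

# Track the objective across sweeps; it should never increase.
history = [poisson_nll(X, W, H)]
for _ in range(20):
    W, H = multiplicative_sweep(X, W, H)
    history.append(poisson_nll(X, W, H))
```

Strictly positive starting factors keep every model value m_ij positive on this toy input, so the logarithm is always defined. A factor entry that reaches zero stays at zero under multiplicative updates, which is one way such iterations can stall away from a KKT point.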
