Sparse Higher-Order Principal Components Analysis

Traditional tensor decompositions such as the CANDECOMP/PARAFAC (CP) and Tucker decompositions yield higher-order principal components that have been used to understand tensor data in areas such as neuroimaging, microscopy, chemometrics, and remote sensing. Sparsity in high-dimensional matrix factorizations and principal components has been well studied and shown to offer many benefits; sparsity in tensor decompositions has received less attention. We propose two novel tensor decompositions that incorporate sparsity: the Sparse Higher-Order SVD and the Sparse CP Decomposition. The latter solves an ℓ1-norm penalized relaxation of the single-factor CP optimization problem, thereby automatically selecting relevant features for each tensor factor. Through experiments and a scientific data analysis example, we demonstrate the utility of our methods for dimension reduction, feature selection, signal recovery, and exploratory data analysis of high-dimensional tensors.
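To make the idea of an ℓ1-penalized single-factor CP problem concrete, the following is a minimal sketch for a three-way tensor, not the authors' reference implementation. It assumes the factors are updated one at a time by contracting the tensor against the other two factors, soft-thresholding the result, and rescaling to unit length. The function names (`soft_threshold`, `sparse_rank1_cp`), the penalty tuple `lam`, the initialization from leading singular vectors of the unfoldings, and the stopping rule are all illustrative assumptions.

```python
import numpy as np


def soft_threshold(x, lam):
    """Elementwise soft-thresholding: shrink toward zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def _normalize(x):
    """Scale to unit Euclidean norm; leave an all-zero vector as is."""
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x


def _leading_left_singular_vector(X, mode):
    """Leading left singular vector of the mode-`mode` unfolding of X."""
    unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
    return np.linalg.svd(unfolded, full_matrices=False)[0][:, 0]


def sparse_rank1_cp(X, lam=(0.0, 0.0, 0.0), n_iter=100, tol=1e-8):
    """Sketch of a sparse single-factor CP fit for a 3-way array X.

    lam holds one (assumed) l1 penalty per mode; larger values give
    sparser factors. Returns (d, u, v, w) with X roughly d * u o v o w.
    """
    # Assumed initialization: leading singular vectors of each unfolding.
    u, v, w = (_leading_left_singular_vector(X, m) for m in range(3))
    for _ in range(n_iter):
        u_old, v_old, w_old = u, v, w
        # Update each factor with the other two held fixed.
        u = _normalize(soft_threshold(np.einsum("ijk,j,k->i", X, v, w), lam[0]))
        v = _normalize(soft_threshold(np.einsum("ijk,i,k->j", X, u, w), lam[1]))
        w = _normalize(soft_threshold(np.einsum("ijk,i,j->k", X, u, v), lam[2]))
        change = max(
            np.linalg.norm(u - u_old),
            np.linalg.norm(v - v_old),
            np.linalg.norm(w - w_old),
        )
        if change < tol:
            break
    # Scale of the fitted rank-one component.
    d = float(np.einsum("ijk,i,j,k->", X, u, v, w))
    return d, u, v, w
```

Calling `sparse_rank1_cp(X, lam=(0.5, 0.5, 0.5))` on a three-way NumPy array returns a scale and three unit-norm factors; entries whose contracted value falls below the penalty are set exactly to zero, which is how the penalty acts as feature selection for each mode. Further components could be obtained by subtracting `d * u ∘ v ∘ w` from `X` and refitting, though that deflation step is likewise an assumption here.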
