Sparse Unsupervised Dimensionality Reduction Algorithms

Principal component analysis (PCA) and its dual, principal coordinate analysis (PCO), are widely applied to unsupervised dimensionality reduction. In this paper, we show that PCA and PCO can be carried out under regression frameworks, which makes it convenient to incorporate sparse techniques. In particular, we propose a sparse PCA model and a sparse PCO model. The former finds sparse principal components, while the latter directly calculates sparse principal coordinates in a low-dimensional space. Our models can be solved by simple and efficient iterative procedures. Finally, we discuss the relationship between our models and other existing sparse PCA methods and present empirical comparisons of these sparse unsupervised dimensionality reduction methods. The experimental results are encouraging.
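The duality between PCA and PCO mentioned above can be illustrated with a small numerical sketch. It is not the sparse models proposed in the paper; it only checks the standard dense fact that PCA scores (from the p x p cross-product matrix) and principal coordinates (from the n x n Gram matrix) coincide up to the sign of each column. The function names `pca_scores` and `pco_coordinates` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def pca_scores(X, k):
    """PCA route: project centered data onto the top-k eigenvectors of Xc^T Xc."""
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(Xc.T @ Xc)   # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:k]          # indices of the k largest
    return Xc @ evecs[:, top]

def pco_coordinates(X, k):
    """PCO route: scale the top-k eigenvectors of the Gram matrix Xc Xc^T."""
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(Xc @ Xc.T)
    top = np.argsort(evals)[::-1][:k]
    # Clip guards against tiny negative eigenvalues caused by round-off.
    return evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))

# Synthetic data (hypothetical): 20 samples, 5 features with distinct scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5])

Z_pca = pca_scores(X, 2)
Z_pco = pco_coordinates(X, 2)

# Eigenvectors are only defined up to sign, so compare each column both ways.
agree = all(
    np.allclose(Z_pca[:, j], Z_pco[:, j]) or np.allclose(Z_pca[:, j], -Z_pco[:, j])
    for j in range(2)
)
print("PCA scores match principal coordinates up to sign:", agree)
```

Working from the Gram matrix is what lets PCO produce low-dimensional coordinates directly, without first forming the principal component loadings.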
