Sparse Principal Component Analysis and Iterative Thresholding

Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features p is comparable to, or even much larger than, the sample size n. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, we find that the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings. Simulated examples also demonstrate its competitive performance.
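
To make the idea concrete, below is a minimal sketch of sparse PCA by iterative thresholding: alternate a multiplication step against the sample covariance with an entrywise hard-thresholding step, then re-orthonormalize. The function name, the random initialization, the fixed iteration count, and the constant threshold are illustrative assumptions, not the paper's exact algorithm (which specifies its own initialization and a threshold tied to the noise level and to sqrt(log p / n)).

```python
# Minimal sketch of sparse PCA via iterative thresholding (orthogonal iteration
# plus entrywise hard thresholding). Assumptions: random initialization, fixed
# iteration count, user-supplied constant threshold.
import numpy as np

def sparse_pca_iterative_thresholding(X, k, threshold, n_iter=50, seed=0):
    """Estimate k sparse leading eigenvectors of the sample covariance of X.

    X         : (n, p) data matrix, rows are observations (assumed centered).
    k         : target subspace dimension.
    threshold : hard-thresholding level (illustrative; in theory it scales
                like sqrt(log(p) / n) times a noise-level estimate).
    """
    n, p = X.shape
    S = X.T @ X / n                         # sample covariance matrix

    # Illustrative initialization: a random p-by-k orthonormal basis.
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((p, k)))

    for _ in range(n_iter):
        Y = S @ Q                           # multiplication (power) step
        Y[np.abs(Y) < threshold] = 0.0      # hard thresholding enforces sparsity
        Q, _ = np.linalg.qr(Y)              # re-orthonormalize the iterate
    return Q                                # columns: estimated sparse loadings
```

On data simulated from a spiked covariance model with sparse leading eigenvectors, this kind of iterate-and-threshold scheme concentrates the estimated loadings on the true support, which is the qualitative behavior the abstract describes.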
