Asymptotic performance of PCA for high-dimensional heteroscedastic data

Principal Component Analysis (PCA) is a classical method for reducing the dimensionality of data by projecting them onto a subspace that captures most of their variation. Effective use of PCA in modern applications requires understanding its performance for data that are both high-dimensional and heteroscedastic. This paper analyzes the statistical performance of PCA in this setting, i.e., for high-dimensional data drawn from a low-dimensional subspace and degraded by heteroscedastic noise. We provide simplified expressions for the asymptotic PCA recovery of the underlying subspace, subspace amplitudes, and subspace coefficients; the expressions are easy and efficient to evaluate and make it straightforward to reason about the performance of PCA. We exploit the structure of these expressions to show that, for a fixed average noise variance, the asymptotic recovery of PCA for heteroscedastic data is always worse than that for homoscedastic data (i.e., for noise variances that are equal across samples). Hence, while average noise variance is often a practically convenient measure of overall data quality, it gives an overly optimistic estimate of the performance of PCA for heteroscedastic data.
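The qualitative claim above can be illustrated with a small simulation. The sketch below (a minimal, hypothetical setup, not the paper's derivation) draws rank-one data y_i = θ u z_i + ε_i, with per-sample Gaussian noise variances, and measures subspace recovery |⟨û, u⟩|² for the top principal component. The function name `pca_subspace_recovery` and all parameter values are illustrative assumptions; for a fixed average noise variance of 1, spreading the variance unevenly across samples degrades recovery.

```python
import numpy as np

def pca_subspace_recovery(variances, n=5000, d=500, theta=np.sqrt(2.0), seed=0):
    """Simulate rank-1 data y_i = theta * u * z_i + eps_i with per-sample
    noise variances, and return |<u_hat, u>|^2 for the top PCA component.

    Illustrative sketch only: parameter choices are assumptions, not values
    from the paper.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)                      # true subspace basis vector
    z = rng.standard_normal(n)                  # subspace coefficients
    sig = np.sqrt(np.asarray(variances, dtype=float))
    noise = rng.standard_normal((n, d)) * sig[:, None]
    Y = theta * np.outer(z, u) + noise          # n x d data matrix
    # Top principal component = leading right singular vector of Y.
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return float(np.dot(Vt[0], u) ** 2)

n = 5000
# Homoscedastic: every sample has noise variance 1.
homo = pca_subspace_recovery(np.full(n, 1.0))
# Heteroscedastic: 10% of samples at variance 9.1, 90% at 0.1 (average = 1).
hetero = pca_subspace_recovery(np.where(np.arange(n) < n // 10, 9.1, 0.1))
print(f"homoscedastic recovery:   {homo:.3f}")
print(f"heteroscedastic recovery: {hetero:.3f}")
```

In this configuration both datasets have the same average noise variance, yet the heteroscedastic run recovers the subspace less accurately, matching the paper's conclusion that average noise variance is an overly optimistic summary of data quality for PCA.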
