Correlated-PCA: Principal Components' Analysis when Data and Noise are Correlated

Given a matrix of observed data, Principal Components Analysis (PCA) computes a small number of orthogonal directions that capture most of its variability. Provably accurate solutions for PCA have been in use for a long time. However, to the best of our knowledge, all existing theoretical guarantees for it assume that the data and the corrupting noise are mutually independent, or at least uncorrelated. This assumption often holds in practice, but not always. In this paper, we study the PCA problem in the setting where the data and noise can be correlated. Such noise is often also referred to as "data-dependent noise". We obtain a correctness result for the standard eigenvalue decomposition (EVD) based solution to PCA under simple assumptions on the data-noise correlation. We also develop and analyze a generalization of EVD, cluster-EVD, that improves upon EVD in certain regimes.
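
For concreteness, the following is a minimal sketch of the standard EVD-based PCA step referred to above: form the empirical covariance of the observed data and keep its top-r eigenvectors. The rank r, the dimensions, and the simple linear data-dependent noise model used here are illustrative assumptions, not the paper's exact setup or its cluster-EVD algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t, r = 100, 500, 5                       # ambient dim, number of samples, true rank

# True low-dimensional data: y_k = P a_k, with P an n x r orthonormal basis
P, _ = np.linalg.qr(rng.standard_normal((n, r)))
A = rng.standard_normal((r, t))
Y = P @ A

# Data-dependent ("correlated") noise: w_k = M y_k for a small linear map M
# (a single fixed M is used here purely for illustration)
M = 0.1 * rng.standard_normal((n, n))
X = Y + M @ Y                               # observed data

# EVD-based PCA: top-r eigenvectors of the empirical covariance (1/t) X X^T
cov = (X @ X.T) / t
eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
P_hat = eigvecs[:, -r:]                     # estimated principal subspace

# Subspace recovery error: || (I - P_hat P_hat^T) P ||_2
err = np.linalg.norm((np.eye(n) - P_hat @ P_hat.T) @ P, 2)
print(f"subspace recovery error: {err:.3f}")
```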
