Robust PCA in High-dimension: A Deterministic Approach

We consider principal component analysis for contaminated data-set in the high dimensional regime, where the dimensionality of each observation is comparable or even more than the number of observations. We propose a deterministic high-dimensional robust PCA algorithm which inherits all theoretical properties of its randomized counterpart, i.e., it is tractable, robust to contaminated points, easily kernelizable, asymptotic consistent and achieves maximal robustness - a breakdown point of 50%. More importantly, the proposed method exhibits significantly better computational efficiency, which makes it suitable for large-scale real applications.

[1]  Alexandre d'Aspremont,et al.  Optimal Solutions for Sparse Principal Component Analysis , 2007, J. Mach. Learn. Res..

[2]  M. Hubert,et al.  A fast method for robust principal components with applications to chemometrics , 2002 .

[3]  R. Maronna Robust $M$-Estimators of Multivariate Location and Scatter , 1976 .

[4]  Yi Ma,et al.  Robust principal component analysis? , 2009, JACM.

[5]  P. Rousseeuw Least Median of Squares Regression , 1984 .

[6]  Gregory Piatetsky-Shapiro,et al.  High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality , 2000 .

[7]  Bernhard Schölkopf,et al.  Kernel Principal Component Analysis , 1997, ICANN.

[8]  Guoying Li,et al.  Projection-Pursuit Approach to Robust Dispersion Matrices and Principal Components: Primary Theory and Monte Carlo , 1985 .

[9]  Constantine Caramanis,et al.  Robust PCA via Outlier Pursuit , 2010, IEEE Transactions on Information Theory.

[10]  Mia Hubert,et al.  ROBPCA: A New Approach to Robust Principal Component Analysis , 2005, Technometrics.

[11]  Shie Mannor,et al.  Principal Component Analysis with Contaminated Data: The High Dimensional Case , 2010, COLT 2010.

[12]  Christophe Croux,et al.  High breakdown estimators for principal components: the projection-pursuit approach revisited , 2005 .

[13]  Christophe Croux,et al.  A Fast Algorithm for Robust Principal Components Based on Projection Pursuit , 1996 .

[14]  Peter J. Rousseeuw,et al.  Robust Regression and Outlier Detection , 2005, Wiley Series in Probability and Statistics.

[15]  Karl Pearson F.R.S. LIII. On lines and planes of closest fit to systems of points in space , 1901 .