Principal Components and Orthogonal Regression Based on Robust Scales

Both principal components analysis (PCA) and orthogonal regression seek a p-dimensional linear manifold that minimizes a scale of the orthogonal distances of the m-dimensional data points to the manifold. The main conceptual difference is that in PCA p is estimated from the data, to attain a small proportion of unexplained variability, whereas in orthogonal regression p equals m − 1. The two main approaches to robust PCA are to use the eigenvectors of a robust covariance matrix and to search for the projections that maximize or minimize a robust (univariate) dispersion measure. This article is more akin to the second approach. But rather than finding the components one by one, we directly address the problem of finding, for a given p, a p-dimensional linear manifold minimizing a robust scale of the orthogonal distances of the data points to the manifold. The scale may be either a smooth M-scale or a “trimmed” scale. An iterative algorithm is developed and shown to converge to a local minimum. A strategy based on random search is used to approximate a global minimum. The procedure is shown to be faster than other high-breakdown-point competitors, especially for large m. The case p = m − 1 yields orthogonal regression. For PCA, a computationally efficient method to choose p is given. Comparisons based on both simulated and real data show that the proposed procedure is more robust than its competitors.
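
The general idea can be illustrated with a minimal Python sketch for the trimmed-scale case. It is not the article's algorithm: the function names, the default trimming size h (75% of the sample), the number of random starts, and the simple "refit on the h closest points" iteration are all assumptions chosen for illustration. The manifold is represented by a center and an m × p matrix B with orthonormal columns, and the scale is the mean of the h smallest squared orthogonal distances.

```python
import numpy as np

def orthogonal_sq_distances(X, center, B):
    """Squared orthogonal distances from the rows of X to the affine
    manifold center + span(B); B is (m, p) with orthonormal columns."""
    Z = X - center
    resid = Z - (Z @ B) @ B.T
    return np.sum(resid**2, axis=1)

def trimmed_scale(d2, h):
    """Mean of the h smallest squared distances (a 'trimmed' scale)."""
    return np.mean(np.sort(d2)[:h])

def fit_subspace(X, idx, p):
    """Classical PCA on the rows X[idx]: their mean and top-p directions."""
    S = X[idx]
    center = S.mean(axis=0)
    _, _, Vt = np.linalg.svd(S - center, full_matrices=False)
    return center, Vt[:p].T

def robust_subspace(X, p, h=None, n_starts=50, n_iter=20, seed=0):
    """Random starts followed by refitting on the h closest points.
    Hypothetical helper: h, n_starts and n_iter are illustrative defaults."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    if h is None:
        h = int(0.75 * n)  # assumed trimming proportion
    best = (np.inf, None, None)
    for _ in range(n_starts):
        # Random search: start from the manifold through p + 1 random points.
        idx = rng.choice(n, size=p + 1, replace=False)
        center, B = fit_subspace(X, idx, p)
        for _ in range(n_iter):
            # Keep the h points closest to the current manifold and refit;
            # each such step cannot increase the trimmed scale.
            d2 = orthogonal_sq_distances(X, center, B)
            idx = np.argsort(d2)[:h]
            center, B = fit_subspace(X, idx, p)
        scale = trimmed_scale(orthogonal_sq_distances(X, center, B), h)
        if scale < best[0]:
            best = (scale, center, B)
    return best  # (scale, center, B)
```

Calling `robust_subspace(X, p=X.shape[1] - 1)` corresponds to the orthogonal regression case, where the fitted manifold is a hyperplane. A smooth M-scale would replace `trimmed_scale` and call for a weighted refitting step, which this sketch does not attempt.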