High breakdown estimators for principal components: the projection-pursuit approach revisited

Li and Chen (J. Amer. Statist. Assoc. 80 (1985) 759) proposed a method for principal components based on projection-pursuit techniques. Classical principal components analysis searches for directions of maximal variance; their approach replaces this variance by a robust scale measure. Li and Chen showed that the resulting estimator is consistent, qualitatively robust, and inherits the breakdown point of the robust scale estimator. We complete their study by deriving the influence functions of the estimators of the eigenvectors, the eigenvalues and the associated dispersion matrix. The corresponding Gaussian efficiencies are presented as well. Asymptotic normality of the estimators has been treated in a paper by Cui et al. (Biometrika 90 (2003) 953), which complements the results of this paper. Furthermore, a simple explicit version of the projection-pursuit based estimator is proposed and shown to be fast to compute, to be orthogonally equivariant, and to attain the maximal finite-sample breakdown point. We illustrate the method with a real data example.
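
The idea of replacing the variance by a robust scale can be illustrated with a short sketch. It is not necessarily the explicit estimator proposed in the paper: it assumes that the candidate directions are the centered observations themselves, that the data are centered at the coordinatewise median, and that the robust scale is the median absolute deviation (MAD). The names `mad` and `pp_pca` are hypothetical.

```python
import numpy as np

def mad(x):
    """Median absolute deviation, scaled to be consistent at the normal model."""
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def pp_pca(X, k, scale=mad):
    """Sketch of projection-pursuit PCA with a robust scale (assumed setup).

    Returns (eigenvalues, directions): the squared robust scales and the
    unit directions, one per row. Requires k <= rank of the centered data.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: center at the coordinatewise median.
    Z = X - np.median(X, axis=0)
    p = Z.shape[1]
    directions = np.zeros((k, p))
    scales = np.zeros(k)
    for j in range(k):
        # Assumption: candidate directions are the normalized data points
        # (after removing previously found components).
        norms = np.linalg.norm(Z, axis=1)
        keep = norms > 1e-12 * max(1.0, norms.max())
        A = Z[keep] / norms[keep, None]
        # Column i holds the projections of all points on candidate i.
        proj = Z @ A.T
        s = np.array([scale(proj[:, i]) for i in range(A.shape[0])])
        best = int(np.argmax(s))
        a = A[best]
        directions[j] = a
        scales[j] = s[best]
        # Deflate: project the data onto the orthogonal complement of a.
        Z = Z - np.outer(Z @ a, a)
    return scales ** 2, directions

# Example: 10% of the points are outliers far out along the third axis.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([10.0, 2.0, 1.0])
X[:20] = rng.normal(size=(20, 3)) * 0.1 + np.array([0.0, 0.0, 50.0])
eigenvalues, directions = pp_pca(X, k=3)
```

Because each direction is searched for in the orthogonal complement of the previous ones, the rows of `directions` are orthonormal by construction. Restricting the search to a finite candidate set makes the maximization explicit, at the cost of only approximating the maximizing direction.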

[1] S. J. Devlin, et al. Robust Estimation of Dispersion Matrices and Principal Components, 1981.

[2] D. G. Simpson, et al. Robust principal component analysis for functional data, 2007.

[3] W. Härdle, et al. Robust and Nonlinear Time Series Analysis, 1984.

[4] L. Ammann. Robust Principal Components, 1989.

[5] P. Diaconis, et al. Computer-Intensive Methods in Statistics, 1983.

[6] E. Ziegel. COMPSTAT: Proceedings in Computational Statistics, 1988.

[7] L. Ammann. Robust Singular Value Decompositions: A New Approach to Projection Pursuit, 1993.

[8] Elaine B. Martin, et al. On principal component analysis in L1, 2002.

[9] Peter J. Rousseeuw, et al. Asymptotics of Generalized S-Estimators, 1994.

[10] W. W. Daniel. Applied Nonparametric Statistics, 1979.

[11] Christophe Croux, et al. A Fast Algorithm for Robust Principal Components Based on Projection Pursuit, 1996.

[12] 张健. ASYMPTOTIC THEORIES FOR THE ROBUST PP ESTIMATORS OF THE PRINCIPAL COMPONENTS AND DISPERSION MATRIX: III. BOOTSTRAP CONFIDENCE SETS, BOOTSTRAP TESTS, 1991.

[13] C. Croux, et al. Generalizing univariate signed rank statistics for testing and estimating a multivariate location parameter, 1995.

[14] P. Rousseeuw. Multivariate estimation with high breakdown point, 1985.

[15] Christophe Croux, et al. Efficient high-breakdown M-estimators of scale, 1994.

[16] Peter J. Rousseeuw, et al. Robust regression and outlier detection, 1987.

[17] Ola Hössjer, et al. On the optimality of S-estimators, 1992.

[18] Werner A. Stahel, et al. Robust Statistics: The Approach Based on Influence Functions, 1987.

[19] C. Croux, et al. Principal Component Analysis Based on Robust Estimators of the Covariance or Correlation Matrix: Influence Functions and Efficiencies, 2000.

[20] Ursula Gather, et al. A Robustified Version of Sliced Inverse Regression, 2001.

[21] Guoying Li, et al. Projection-Pursuit Approach to Robust Dispersion Matrices and Principal Components: Primary Theory and Monte Carlo, 1985.

[22] Yu-Long Xie, et al. Robust principal component analysis by projection pursuit, 1993.

[23] Graciela Boente, et al. A Robust Approach to Common Principal Components, 2001.

[24] I. Jolliffe. Principal Component Analysis, 2002.

[25] P. L. Davies, et al. Asymptotic behaviour of S-estimates of multivariate location parameters and dispersion matrices, 1987.

[26] Gérard Antille, et al. Stability of robust and non-robust principal components analysis, 1990.

[27] Douglas G. Simpson, et al. Robust Direction Estimation, 1992.

[28] Hengjian Cui, et al. Asymptotic distributions of principal components based on robust dispersions, 2003.

[29] F. Critchley. Influence in principal components analysis, 1985.

[30] G. C. McDonald, et al. Instabilities of Regression Estimates Relating Air Pollution to Mortality, 1973.

[31] Georg Ch. Pflug, et al. Mathematical statistics and applications, 1985.

[32] Peter J. Rousseeuw, et al. ROBUST REGRESSION BY MEANS OF S-ESTIMATORS, 1984.

[33] Werner A. Stahel, et al. Statistics in Genetics and in the Environmental Sciences, 2001, Entropy.

[34] Mia Hubert, et al. An improved algorithm for robust PCA, 2000.

[35] P. Rousseeuw, et al. Alternatives to the Median Absolute Deviation, 1993.

[36] Ana M. Pires, et al. Influence functions and outlier detection under the common principal components model: A robust approach, 2002.

[37] P. Rousseeuw, et al. Generalized S-Estimators, 1994.

[38] P. Rousseeuw, et al. A fast algorithm for the minimum covariance determinant estimator, 1999.

[39] P. Filzmoser. Robust principal component and factor analysis in the geostatistical treatment of environmental data, 1999.