New algorithms for computing the least trimmed squares regression estimator

Outlier detection in multiple linear regression is difficult because of the masking effect, in which several outliers conceal one another from diagnostics based on an ordinary least squares fit. A successful approach is to use residuals from a high breakdown estimator. The least trimmed squares (LTS) estimator, proposed by Rousseeuw (J. Amer. Statist. Assoc. 79 (1984)), is one such estimator. In this paper we propose two algorithms for computing the LTS estimator. The first is probabilistic and based on an exchange procedure. The second is exact and based on a branch-and-bound technique that guarantees global optimality without exhaustive evaluation. We discuss the implementation of both algorithms using orthogonal decomposition procedures and propose several accelerations. Applied to real and simulated data sets, the new algorithms significantly reduce the computational cost relative to the algorithms previously described in the literature.
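To make the LTS criterion and the idea of an exchange procedure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the function names (`lts_objective`, `fit_subset`, `exchange_lts`), the random starts, and the first-improvement pairwise swap rule are all hypothetical choices, and each trial subset is refitted from scratch with `numpy.linalg.lstsq` rather than updated through orthogonal decompositions. The LTS criterion used is the sum of the h smallest squared residuals.

```python
import numpy as np

def lts_objective(X, y, beta, h):
    """LTS criterion: sum of the h smallest squared residuals at beta."""
    r2 = (y - X @ beta) ** 2
    return np.sort(r2)[:h].sum()

def fit_subset(X, y, idx):
    """Ordinary least squares fit restricted to the rows in idx."""
    beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return beta

def subset_rss(X, y, idx, beta):
    """Residual sum of squares of beta over the rows in idx."""
    return ((y[idx] - X[idx] @ beta) ** 2).sum()

def exchange_lts(X, y, h, n_starts=20, seed=0):
    """Naive pairwise-exchange heuristic (illustrative assumption only).

    From each random h-subset, swap one inside point for one outside
    point whenever the swap lowers the subset's least squares RSS, and
    stop when no single swap improves it. Returns the best coefficients
    found and the RSS of their subset.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    best_beta, best_rss = None, np.inf
    for _ in range(n_starts):
        inside = rng.choice(n, size=h, replace=False)
        beta = fit_subset(X, y, inside)
        rss = subset_rss(X, y, inside, beta)
        improved = True
        while improved:
            improved = False
            outside = np.setdiff1d(np.arange(n), inside)
            for i in range(h):
                for j in outside:
                    trial = inside.copy()
                    trial[i] = j  # exchange one point in for one out
                    b = fit_subset(X, y, trial)
                    r = subset_rss(X, y, trial, b)
                    if r < rss - 1e-12:  # strict decrease ensures termination
                        inside, beta, rss = trial, b, r
                        improved = True
                        break
                if improved:
                    break
        if rss < best_rss:
            best_beta, best_rss = beta, rss
    return best_beta, best_rss
```

Because every accepted swap strictly lowers the subset RSS and there are finitely many h-subsets, each start terminates at a subset that no single exchange can improve. Such a subset is only locally optimal, which is why a heuristic of this kind is probabilistic; certifying the global optimum requires an exact method such as the branch-and-bound approach the abstract describes.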
