An algorithm for computing exact least-trimmed squares estimate of simple linear regression with constraints

Abstract: Least-trimmed squares (LTS) estimation is a robust approach to regression problems. On the one hand, it can achieve any given breakdown value by setting an appropriate trimming fraction. On the other hand, it is √n-consistent and asymptotically normal under some conditions. In addition, the LTS estimator is regression, scale, and affine equivariant. In practical regression problems, we often need to impose constraints on slopes. In this paper, we describe a stable algorithm to compute the exact LTS solution for simple linear regression with constraints on the slope parameter. Without constraints, the overall complexity of the algorithm is O(n^2 log n) in time and O(n^2) in storage. According to our numerical tests, constraints can reduce the computing load substantially. To achieve stability, we design the algorithm so that it can take advantage of well-developed sorting algorithms and software. We illustrate the algorithm with some examples.
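To make the LTS criterion concrete, the following is a minimal sketch in Python. It is not the algorithm described in the paper. It relies on one standard fact: for a fixed slope, the problem reduces to univariate LTS location on the values y_i - slope * x_i, whose optimal h-subset is contiguous in sorted order, so one sort plus a sliding window gives the exact intercept. The slope is then searched on a plain grid inside the constraint interval, which is only approximate. The function names, the grid search, and the parameters `lo`, `hi`, and `steps` are assumptions made for illustration.

```python
def lts_objective(x, y, slope, intercept, h):
    """Sum of the h smallest squared residuals for a given line."""
    sq = sorted((yi - slope * xi - intercept) ** 2 for xi, yi in zip(x, y))
    return sum(sq[:h])


def best_intercept_for_slope(x, y, slope, h):
    """Exact LTS intercept for a FIXED slope.

    Sorts z_i = y_i - slope * x_i and scans every window of h consecutive
    values; the window with the smallest sum of squares about its own mean
    gives the optimal intercept. Returns (intercept, trimmed_sum_of_squares).
    """
    z = sorted(yi - slope * xi for xi, yi in zip(x, y))
    n = len(z)
    # Prefix sums of z and z^2 so each window costs O(1).
    s1, s2 = [0.0], [0.0]
    for v in z:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)
    best = None
    for k in range(n - h + 1):
        t1 = s1[k + h] - s1[k]
        t2 = s2[k + h] - s2[k]
        # Clamp at zero: this formula can go slightly negative from rounding.
        ss = max(t2 - t1 * t1 / h, 0.0)
        if best is None or ss < best[1]:
            best = (t1 / h, ss)
    return best


def constrained_lts_grid(x, y, h, lo, hi, steps=401):
    """Approximate LTS fit with the slope restricted to [lo, hi].

    Illustrative grid search only (hypothetical helper, not the paper's
    exact method). Returns (slope, intercept, trimmed_sum_of_squares).
    """
    best = None
    for i in range(steps):
        slope = lo + (hi - lo) * i / (steps - 1)
        intercept, ss = best_intercept_for_slope(x, y, slope, h)
        if best is None or ss < best[2]:
            best = (slope, intercept, ss)
    return best


# Toy data: y = 2x + 1 with two gross outliers (made-up values).
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 for xi in x]
y[3] = 57.0
y[7] = -25.0

slope, intercept, ss = constrained_lts_grid(x, y, h=8, lo=0.0, hi=4.0)
```

With h = 8 of the 10 points retained, the two outliers are trimmed and the grid search recovers the line through the remaining points. Each slope costs one sort, which loosely mirrors the abstract's remark about leaning on well-developed sorting routines; an exact method would replace the grid with a finite set of candidate slopes.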
