The feasible set algorithm for least median of squares regression

Abstract. The Least Median of Squares (LMS) criterion is a current standard method for analyzing data when the possibility of severe, badly placed outliers makes an estimate with a high breakdown point desirable. Sometimes the LMS criterion is used in its own right, and sometimes it is the starting point for follow-up analyses. Difficulties have arisen in its use, however, because until recently there was no known way to obtain an exact LMS fit to a data set with more than one predictor. This has confused the discussion of LMS, since there is no way of knowing to what extent particular features seen in an analysis really are properties of the LMS estimator and to what extent they are artifacts of computed LMS fits that only approximate the exact solution, with unknown quality. A recent algorithm by Stromberg has alleviated this difficulty by providing a mechanism for obtaining an exact fit. Unfortunately, this approach is computationally intractable for all but quite small problems. The present paper proposes a probabilistic algorithm, called the ‘Feasible Set Algorithm’, which produces only trial values satisfying the necessary condition for the optimum and which finds the exact solution with probability 1 as the number of iterations increases. The method's good performance on real data sets is verified by example.
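To make the criterion concrete, the following minimal sketch evaluates the LMS objective (the median of the squared residuals) and searches for a good fit by drawing random elemental subsets of p observations. This is the kind of approximate search the abstract contrasts with exact methods; it is not the Feasible Set Algorithm itself, whose details are not given here. The function names `lms_objective` and `lms_random_subsets`, the use of the plain median, and the default trial count are assumptions made for illustration.

```python
import numpy as np

def lms_objective(X, y, beta):
    """Median of squared residuals for coefficients beta.

    X is assumed to already contain a column of ones if an
    intercept is wanted (an illustrative convention).
    """
    r = y - X @ beta
    return float(np.median(r ** 2))

def lms_random_subsets(X, y, n_trials=500, seed=0):
    """Approximate LMS fit by exact fits to random p-point subsets.

    Hypothetical helper: each trial fits p randomly chosen
    observations exactly and keeps the fit with the smallest
    median squared residual over all n observations.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_obj = None, np.inf
    for _ in range(n_trials):
        idx = rng.choice(n, size=p, replace=False)
        try:
            beta = np.linalg.solve(X[idx], y[idx])
        except np.linalg.LinAlgError:
            continue  # singular subset, draw another one
        obj = lms_objective(X, y, beta)
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta, best_obj
```

A search of this kind returns only the best fit among the subsets it happened to draw, so the quality of the result relative to the exact LMS solution is unknown, which is the difficulty the abstract describes.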
