Chapter 10 – Robust Regression

Chapter 10 summarizes a wide range of robust regression estimators and discusses their relative merits. Generally, these estimators deal effectively with regression outliers and leverage points, and some can offer a substantial advantage in efficiency when there is heteroscedasticity. The chapter also covers robust versions of logistic regression and recently derived methods for multivariate regression, two of which, in contrast to most estimators that have been proposed, take into account the association among the outcome variables. R functions for applying these estimators are described.
