Cross-validation in nonparametric regression with outliers

Cross-validation is a popular data-driven method for choosing the bandwidth in standard kernel regression. When the data contain outliers, robust kernel regression can still be used to estimate the unknown regression curve [Robust and Nonlinear Time Series Analysis. Lecture Notes in Statist. (1984) 26 163-184]. Under these circumstances, however, standard cross-validation is no longer a satisfactory bandwidth selector, because it is unduly influenced by the extreme prediction errors that the outliers cause. The more robust method proposed here is a cross-validation criterion that discounts extreme prediction errors. In large samples the robust method chooses consistent bandwidths, and this consistency is practically independent of the form in which extreme prediction errors are discounted. An evaluation of the method's finite-sample behavior in a simulation shows that it performs favorably. The method can also be applied to other problems that require cross-validation, for example model selection.
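The idea of discounting extreme prediction errors can be illustrated with a minimal sketch. It is not the paper's estimator: it assumes a Nadaraya-Watson smoother with a Gaussian kernel in place of the robust kernel regression the abstract refers to, and it assumes Huber's loss as one possible discounting function. The function names and the cutoff parameter `c` are hypothetical and chosen only for illustration.

```python
import numpy as np

def loo_nw_predictions(x, y, h):
    """Leave-one-out Nadaraya-Watson predictions (Gaussian kernel, assumed)."""
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d**2)
    np.fill_diagonal(w, 0.0)  # exclude each point from its own prediction
    denom = w.sum(axis=1)
    # If all weights underflow to zero, the prediction is undefined (NaN).
    return (w @ y) / np.where(denom > 0, denom, np.nan)

def huber_rho(r, c):
    """Huber loss: quadratic for |r| <= c, linear beyond (assumed discounting)."""
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r**2, c * a - 0.5 * c**2)

def cv_score(x, y, h, c=None):
    """Mean discounted leave-one-out error; c=None gives standard squared-error CV."""
    r = y - loo_nw_predictions(x, y, h)
    if c is None:
        return np.nanmean(r**2)
    return np.nanmean(huber_rho(r, c))

def select_bandwidth(x, y, grid, c=None):
    """Return the bandwidth in `grid` with the smallest CV score, plus all scores."""
    scores = np.array([cv_score(x, y, h, c) for h in grid])
    return grid[int(np.nanargmin(scores))], scores
```

Calling `select_bandwidth(x, y, grid)` gives the standard squared-error choice, while `select_bandwidth(x, y, grid, c=1.0)` caps the influence of large prediction errors. The cutoff would normally be set relative to a robust scale estimate of the residuals, which this sketch leaves to the caller.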

[1] H. Müller et al. Kernel estimation of regression functions, 1979.

[2] James Stephen Marron et al. Regression smoothing parameters that are not far from their optimum, 1992.

[3] W. Cleveland. Robust Locally Weighted Regression and Smoothing Scatterplots, 1979.

[4] Frederick R. Forst et al. On robust estimation of the location parameter, 1980.

[5] P. Whittle et al. Bounds for the Moments of Linear and Quadratic Forms in Independent Variables, 1960.

[6] D. Cox. Asymptotics for $M$-Type Smoothing Splines, 1983.

[7] W. Härdle et al. How Far are Automatically Chosen Regression Smoothing Parameters from their Optimum, 1988.

[8] W. Härdle. Applied Nonparametric Regression, 1992.

[9] D. W. Scott et al. The L1 Method for Robust Nonparametric Regression, 1994.

[10] J. Rice. Bandwidth Choice for Nonparametric Regression, 1984.

[11] Wolfgang Härdle. Resistant Smoothing Using the Fast Fourier Transform, 1987.

[12] Elvezio Ronchetti et al. Robust Linear Model Selection by Cross-Validation, 1997.

[13] G. S. Watson et al. Smooth regression analysis, 1964.

[14] E. Nadaraya. On Estimating Regression, 1964.

[15] Francis H.C Marriott et al. Bandwidth selection in robust smoothing, 1993.

[16] Jean Meloche et al. Robust plug-in bandwidth estimators in nonparametric regression, 1997.

[17] W. Härdle et al. Uniform Consistency of a Class of Regression Function Estimators, 1984.

[18] H. Akaike. A new look at the statistical model identification, 1974.

[19] James Stephen Marron et al. Comparison of data-driven bandwidth selectors, 1988.

[20] Wolfgang Härdle. How to determine the bandwidth of some nonlinear smoothers in practice, 1984.

[21] W. Härdle et al. Robust Non-parametric Function Fitting, 1984.

[22] R. Shibata. An optimal selection of regression variables, 1981.

[23] M. Priestley et al. Non‐Parametric Function Fitting, 1972.