Statistica Sinica Preprint No: SS-2016-0546R1
Title: Calibrated Percentile Double Bootstrap for Robust Linear Regression Inference

We consider inference for the parameters of a linear model when the covariates are random and the relationship between response and covariates is possibly non-linear. Conventional inference methods such as z-intervals perform poorly in these cases. We propose perc-cal, a calibrated percentile method based on the double bootstrap, as a general-purpose confidence interval (CI) method that performs very well relative to alternative methods in such challenging situations. The superior performance of perc-cal is demonstrated by a thorough synthetic data study with a full-factorial design, as well as by a real data example involving the length of criminal sentences. We also provide theoretical justification for the perc-cal method under mild conditions. To make the method easier for practitioners to use, it is implemented in the R package 'perccal', which is available through CRAN and coded primarily in C++.
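To make the idea of calibrating a percentile interval with a double bootstrap concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the implementation used in the perccal package: it assumes a single covariate, a pairs (x, y) bootstrap, and an equal-tailed percentile interval, and it picks the calibrated level from the second-level bootstrap by a quantile rule. All function names (`slope`, `resample_pairs`, `quantile`, `perc_cal_interval`) and the default values of `b1` and `b2` are hypothetical.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of y on x (model with intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def resample_pairs(xs, ys, rng):
    """Pairs bootstrap: draw n (x, y) pairs with replacement.

    Redraws in the rare case that every resampled x is identical,
    because the slope is undefined for such a sample.
    """
    n = len(xs)
    while True:
        idx = [rng.randrange(n) for _ in range(n)]
        if len({xs[i] for i in idx}) > 1:
            return [xs[i] for i in idx], [ys[i] for i in idx]


def quantile(sorted_vals, q):
    """Empirical quantile of pre-sorted values by linear interpolation."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def perc_cal_interval(xs, ys, alpha=0.05, b1=200, b2=100, seed=0):
    """Calibrated percentile interval for the slope (illustrative sketch).

    Returns (lower, upper, alpha_cal), where alpha_cal is the adjusted
    nominal error level applied to the first-level bootstrap distribution.
    """
    rng = random.Random(seed)
    theta_hat = slope(xs, ys)
    first_level = []
    v = []
    for _ in range(b1):
        x1, y1 = resample_pairs(xs, ys, rng)
        first_level.append(slope(x1, y1))
        # Second level: where does theta_hat fall in the bootstrap
        # distribution built from this first-level sample?
        below = 0
        for _ in range(b2):
            x2, y2 = resample_pairs(x1, y1, rng)
            if slope(x2, y2) <= theta_hat:
                below += 1
        u = below / b2
        # Up to discreteness, an equal-tailed percentile interval with
        # error level a covers theta_hat exactly when 2*min(u, 1-u) >= a.
        v.append(2 * min(u, 1 - u))
    v.sort()
    # Choose the level whose estimated coverage is about 1 - alpha.
    alpha_cal = quantile(v, alpha)
    first_level.sort()
    lower = quantile(first_level, alpha_cal / 2)
    upper = quantile(first_level, 1 - alpha_cal / 2)
    return lower, upper, alpha_cal


if __name__ == "__main__":
    # Synthetic example: random covariate, non-linear mean function.
    data_rng = random.Random(42)
    xs = [data_rng.uniform(0, 2) for _ in range(40)]
    ys = [x ** 2 + data_rng.gauss(0, 0.3) for x in xs]
    print(perc_cal_interval(xs, ys))
```

In this sketch, when the plain percentile interval under-covers in the bootstrap world, `alpha_cal` comes out smaller than `alpha` and the reported interval widens accordingly. The cost is roughly `b1 * b2` model fits, which is one reason a compiled implementation is attractive in practice.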
