Estimation and Accuracy After Model Selection

Classical statistical theory ignores model selection when assessing estimation accuracy. Here we consider bootstrap methods for computing standard errors and confidence intervals that take model selection into account. The methodology involves bagging, also known as bootstrap smoothing, to tame the erratic discontinuities of selection-based estimators. A useful new formula for the accuracy of bagging then provides standard errors for the smoothed estimators. Two examples, one nonparametric and one parametric, are carried through in detail: a regression model in which the polynomial degree (linear, quadratic, cubic, …) is chosen by the Cp criterion, and a Lasso-based estimation problem.
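The procedure sketched in the abstract can be illustrated in a few lines: average a Cp-selected estimator over bootstrap resamples (bagging), then compute the smoothed estimator's standard error from the covariances between the resampling counts and the bootstrap replicates. The data, the target point `x0`, and the helper `cp_fit_predict` below are illustrative assumptions, not the paper's own examples; the standard-error line implements the covariance formula sd = sqrt(sum_j cov_j^2), where cov_j is the bootstrap covariance between the count of observation j and the replicate values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 40, 500
x = np.linspace(-2.0, 2.0, n)
y = x + 0.5 * x**2 + rng.normal(0.0, 1.0, n)   # synthetic data (illustrative)
x0 = 1.5                                        # point at which mu(x0) is estimated

def cp_fit_predict(xs, ys, x0, max_deg=3):
    """Pick polynomial degree by Mallows' Cp; return the fitted value at x0."""
    m = len(ys)
    # sigma^2 estimated from the largest candidate model
    res_full = ys - np.polyval(np.polyfit(xs, ys, max_deg), xs)
    sigma2 = res_full @ res_full / (m - max_deg - 1)
    best_cp, best_pred = np.inf, 0.0
    for deg in range(1, max_deg + 1):
        coef = np.polyfit(xs, ys, deg)
        rss = np.sum((ys - np.polyval(coef, xs)) ** 2)
        cp = rss + 2.0 * (deg + 1) * sigma2     # Cp-style penalized fit
        if cp < best_cp:
            best_cp, best_pred = cp, np.polyval(coef, x0)
    return best_pred

# Bagging: average the selection-based estimator over B bootstrap resamples
counts = np.zeros((B, n))   # N*_ij: times observation j appears in resample i
t_star = np.zeros(B)        # t*_i: estimate computed from resample i
for i in range(B):
    idx = rng.integers(0, n, n)
    counts[i] = np.bincount(idx, minlength=n)
    t_star[i] = cp_fit_predict(x[idx], y[idx], x0)

s_bag = t_star.mean()       # smoothed (bagged) estimate
# Accuracy formula: cov_j = cov(N*_ij, t*_i); sd = sqrt(sum_j cov_j^2)
cov_j = ((counts - counts.mean(axis=0)) * (t_star - s_bag)[:, None]).mean(axis=0)
sd_bag = np.sqrt(np.sum(cov_j ** 2))
```

The smoothed estimate `s_bag` varies continuously with the data even though each replicate involves a discrete degree choice, which is what makes the delta-method standard error `sd_bag` usable where a naive plug-in interval would ignore the selection step.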
