Estimation and Accuracy After Model Selection

Classical statistical theory ignores model selection when assessing estimation accuracy. Here we consider bootstrap methods for computing standard errors and confidence intervals that take model selection into account. The methodology involves bagging, also known as bootstrap smoothing, to tame the erratic discontinuities of selection-based estimators. A useful new formula for the accuracy of bagging then provides standard errors for the smoothed estimators. Two examples, one nonparametric and one parametric, are carried through in detail: a regression model in which the polynomial degree (linear, quadratic, cubic, …) is chosen by the Cp criterion, and a Lasso-based estimation problem.
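The procedure sketched in the abstract can be illustrated in a few lines: average a Cp-selected estimator over bootstrap resamples (bagging), then compute the smoothed estimator's standard error from the covariances between the resampling counts and the bootstrap replicates. The data, the target point `x0`, and the helper `cp_fit_predict` below are illustrative assumptions, not the paper's own examples; the standard-error line implements the covariance formula sd = sqrt(sum_j cov_j^2), where cov_j is the bootstrap covariance between the count of observation j and the replicate values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 40, 500
x = np.linspace(-2.0, 2.0, n)
y = x + 0.5 * x**2 + rng.normal(0.0, 1.0, n)   # synthetic data (illustrative)
x0 = 1.5                                        # point at which mu(x0) is estimated

def cp_fit_predict(xs, ys, x0, max_deg=3):
    """Pick polynomial degree by Mallows' Cp; return the fitted value at x0."""
    m = len(ys)
    # sigma^2 estimated from the largest candidate model
    res_full = ys - np.polyval(np.polyfit(xs, ys, max_deg), xs)
    sigma2 = res_full @ res_full / (m - max_deg - 1)
    best_cp, best_pred = np.inf, 0.0
    for deg in range(1, max_deg + 1):
        coef = np.polyfit(xs, ys, deg)
        rss = np.sum((ys - np.polyval(coef, xs)) ** 2)
        cp = rss + 2.0 * (deg + 1) * sigma2     # Cp-style penalized fit
        if cp < best_cp:
            best_cp, best_pred = cp, np.polyval(coef, x0)
    return best_pred

# Bagging: average the selection-based estimator over B bootstrap resamples
counts = np.zeros((B, n))   # N*_ij: times observation j appears in resample i
t_star = np.zeros(B)        # t*_i: estimate computed from resample i
for i in range(B):
    idx = rng.integers(0, n, n)
    counts[i] = np.bincount(idx, minlength=n)
    t_star[i] = cp_fit_predict(x[idx], y[idx], x0)

s_bag = t_star.mean()       # smoothed (bagged) estimate
# Accuracy formula: cov_j = cov(N*_ij, t*_i); sd = sqrt(sum_j cov_j^2)
cov_j = ((counts - counts.mean(axis=0)) * (t_star - s_bag)[:, None]).mean(axis=0)
sd_bag = np.sqrt(np.sum(cov_j ** 2))
```

The smoothed estimate `s_bag` varies continuously with the data even though each replicate involves a discrete degree choice, which is what makes the delta-method standard error `sd_bag` usable where a naive plug-in interval would ignore the selection step.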
