Confidence Intervals for the Generalisation Error of Random Forests

Out-of-bag error is commonly used as an estimate of generalisation error in ensemble-based learning models such as random forests. We present confidence intervals for this quantity using the delta-method-after-bootstrap and the jackknife-after-bootstrap techniques. These methods do not require growing any additional trees. We show, in real and simulated examples, that these new confidence intervals have improved coverage properties over the naïve confidence interval.
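To make the baseline concrete, the following is a minimal sketch of how an out-of-bag error and a naïve interval around it might be computed for a classifier. It rests on an assumption the abstract does not spell out: the naïve interval is taken to be a normal-approximation interval that treats the per-observation out-of-bag losses as independent. The function name `oob_error_with_naive_ci` and the inputs `tree_preds` and `in_bag` are hypothetical, chosen for illustration only. This is not the delta-method-after-bootstrap or the jackknife-after-bootstrap interval.

```python
import math
from statistics import NormalDist


def oob_error_with_naive_ci(y, tree_preds, in_bag, level=0.95):
    """Out-of-bag misclassification error with a normal-approximation interval.

    y[i]             : true label of observation i
    tree_preds[b][i] : label predicted by tree b for observation i
    in_bag[b][i]     : True if observation i was in tree b's bootstrap sample

    Assumption (not taken from the source): the per-observation OOB losses
    are treated as independent draws.
    """
    losses = []
    for i, yi in enumerate(y):
        # Only trees that did not see observation i get a vote.
        votes = [tree_preds[b][i] for b in range(len(tree_preds)) if not in_bag[b][i]]
        if not votes:
            continue  # observation was in every bootstrap sample
        # Majority vote; ties go to the smallest label for determinism.
        majority = max(sorted(set(votes)), key=votes.count)
        losses.append(0.0 if majority == yi else 1.0)

    n = len(losses)
    err = sum(losses) / n
    var = sum((loss - err) ** 2 for loss in losses) / (n - 1)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(var / n)
    return err, (err - half_width, err + half_width)


# Toy inputs: 4 observations, 3 trees (values are made up for illustration).
y = [0, 1, 1, 0]
tree_preds = [[0, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 0]]
in_bag = [
    [True, False, False, True],
    [False, True, False, False],
    [False, False, True, False],
]
err, (lo, hi) = oob_error_with_naive_ci(y, tree_preds, in_bag)
```

The sketch reuses the predictions of trees that are already grown, so no additional trees are needed for this baseline either. Because the out-of-bag losses share trees and training data, the independence assumption is questionable, which is one plausible reason for such an interval to have poor coverage.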
