The Importance of Knowing When to Stop

OBJECTIVES: Component-wise boosting algorithms have evolved into a popular estimation scheme in biomedical regression settings. The number of iterations is the most important tuning parameter for optimizing their performance, yet to date no fully automated strategy for determining the optimal stopping iteration of boosting algorithms has been proposed.

METHODS: We propose a fully data-driven sequential stopping rule for boosting algorithms. It combines resampling methods with a modified version of an earlier stopping approach based on AIC-type information criteria. The new "subsampling after AIC" stopping rule is applied to component-wise gradient boosting algorithms.

RESULTS: The newly developed sequential stopping rule outperformed earlier approaches when applied to both simulated and real data. In particular, it improved on purely AIC-based methods for the microarray-based prediction of metastasis recurrence in stage II colon cancer patients.

CONCLUSIONS: The proposed sequential stopping rule can identify the optimal stopping iteration during the fitting process of the algorithm itself, at least for the most common loss functions.
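The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea, not the authors' method: component-wise L2 boosting, a crude AIC computed along the boosting path (with degrees of freedom proxied by the number of distinct selected base-learners, which is an assumption made here for simplicity), and a stopping iteration taken as the median of the AIC-optimal iterations over random subsamples. All function names are hypothetical.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_iter=200, nu=0.1):
    """Component-wise L2 boosting: each step fits ONE predictor to the
    current residuals and takes a small step of length nu.
    Assumes the columns of X are (roughly) standardized."""
    n, p = X.shape
    f = np.full(n, y.mean())          # start from the offset model
    selected, rss_path = [], []
    for _ in range(n_iter):
        r = y - f
        # univariate least-squares coefficient for every candidate predictor
        betas = X.T @ r / (X ** 2).sum(axis=0)
        rss = ((r[None, :] - betas[:, None] * X.T) ** 2).sum(axis=1)
        j = int(np.argmin(rss))       # best-fitting base-learner this step
        f += nu * betas[j] * X[:, j]
        selected.append(j)
        rss_path.append(((y - f) ** 2).sum())
    return np.array(rss_path), selected

def aic_stop(rss_path, selected, n):
    """AIC-optimal iteration along one boosting path. The degrees of
    freedom are proxied (crudely) by the number of distinct predictors
    selected so far -- a simplification, not the paper's df estimate."""
    aics, seen = [], set()
    for rss, j in zip(rss_path, selected):
        seen.add(j)
        aics.append(n * np.log(rss / n) + 2 * len(seen))
    return int(np.argmin(aics)) + 1   # 1-indexed iteration number

def subsampled_stop(X, y, n_sub=25, frac=0.5, rng=None, **boost_kw):
    """'Subsampling after AIC' in spirit: the AIC-optimal stopping
    iteration is computed on several random subsamples, and the
    median over subsamples is returned as the stopping iteration."""
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    stops = []
    for _ in range(n_sub):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        rss_path, sel = componentwise_l2_boost(X[idx], y[idx], **boost_kw)
        stops.append(aic_stop(rss_path, sel, len(idx)))
    return int(np.median(stops))
```

Because each subsample's AIC curve can be evaluated while the path is being fitted, a rule of this shape can stop sequentially rather than requiring a full grid of candidate iterations, which is the practical point the conclusions emphasize.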
