Using Iterated Bagging to Debias Regressions

Breiman (Machine Learning, 24(2), 123–140) showed that bagging can effectively reduce the variance of regression predictors while leaving the bias relatively unchanged. A new form of bagging we call iterated bagging is effective in reducing both bias and variance. The procedure works in stages: the first stage is ordinary bagging. Based on the outcomes of the first stage, the output values are altered, and a second stage of bagging is carried out using the altered output values. This is repeated until a simple rule stops the process. The method is tested using both trees and nearest-neighbor regression methods. Accuracy on the Boston Housing benchmark is comparable to the best of the results obtained with highly tuned, compute-intensive Support Vector Regression Machines. Some heuristic theory is given to clarify what is going on. Application to two-class classification data gives interesting results.
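The staged procedure described in the abstract can be sketched in code. This is a minimal illustration, not the paper's exact algorithm: the function name, the choice of bagged decision trees, the use of out-of-bag residuals as the "altered output values", and the stopping threshold of 1.1 times the best mean squared residual so far are all assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def iterated_bagging(X, y, n_bags=50, max_stages=5, seed=0):
    """Sketch of iterated bagging for regression.

    Each stage bags trees on the current targets; the next stage's targets
    are the residuals left by the current stage's out-of-bag predictions,
    so later stages try to model the bias left behind by earlier ones.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    stages = []                                   # fitted trees, one list per stage
    targets = np.asarray(y, dtype=float).copy()   # "altered output values"
    best_mse = np.inf

    for _ in range(max_stages):
        trees = []
        oob_sum, oob_cnt = np.zeros(n), np.zeros(n)
        for _ in range(n_bags):
            idx = rng.integers(0, n, size=n)       # bootstrap sample
            tree = DecisionTreeRegressor(
                random_state=int(rng.integers(1 << 31)))
            tree.fit(X[idx], targets[idx])
            trees.append(tree)
            oob = np.setdiff1d(np.arange(n), idx)  # out-of-bag cases
            if oob.size:
                oob_sum[oob] += tree.predict(X[oob])
                oob_cnt[oob] += 1
        oob_pred = np.divide(oob_sum, oob_cnt,
                             out=np.zeros(n), where=oob_cnt > 0)
        residuals = targets - oob_pred
        mse = float(np.mean(residuals ** 2))
        stages.append(trees)
        # Stop when a stage no longer helps (the 1.1 factor is an assumption).
        if mse > 1.1 * best_mse:
            stages.pop()
            break
        best_mse = min(best_mse, mse)
        targets = residuals                        # altered outputs for next stage

    def predict(X_new):
        """Sum the bagged predictions of all retained stages."""
        pred = np.zeros(len(X_new))
        for trees in stages:
            pred += np.mean([t.predict(X_new) for t in trees], axis=0)
        return pred

    return predict
```

With NumPy arrays, usage would look like `predict = iterated_bagging(X_train, y_train)` followed by `y_hat = predict(X_test)`: the first stage reproduces ordinary bagging, and each later stage adds a correction fitted to the out-of-bag residuals. Computing the residuals out-of-bag, rather than on the training fits themselves, is what keeps the later stages from simply chasing noise.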

[1] Leo Breiman. Bagging Predictors, 1996, Machine Learning.

[2] Robert Tibshirani, et al. Bias, Variance and Prediction Error for Classification Rules, 1996.

[3] Alexander J. Smola, et al. Support Vector Regression Machines, 1996, NIPS.

[4] Leo Breiman. Half&Half Bagging and Hard Boundary Points, 1998.

[5] L. Breiman. Out-of-Bag Estimation, 1996.

[6] L. Breiman. Arcing classifier (with discussion and a rejoinder by the author), 1998.

[7] J. Weston, et al. Support vector regression with ANOVA decomposition kernels, 1999.

[8] David H. Wolpert, et al. An Efficient Method To Estimate Bagging's Generalization Error, 1999, Machine Learning.

[9] Harris Drucker. Improving Regressors using Boosting Techniques, 1997, ICML.

[10] L. Breiman. Arcing Classifiers, 1998.

[11] Leo Breiman. Hinging hyperplanes for regression, classification, and function approximation, 1993, IEEE Trans. Inf. Theory.

[12] Leo Breiman, et al. Classification and Regression Trees, 1984.

[13] Amanda J. C. Sharkey, et al. On Combining Artificial Neural Nets, 1996, Connect. Sci.

[14] Bernhard Schölkopf, et al. Shrinking the Tube: A New Support Vector Regression Algorithm, 1998, NIPS.

[15] Yoav Freund, et al. Experiments with a New Boosting Algorithm, 1996, ICML.

[16] Elie Bienenstock, et al. Neural Networks and the Bias/Variance Dilemma, 1992, Neural Computation.

[17] J. Friedman. Multivariate adaptive regression splines, 1990.

[18] R. Tibshirani, et al. Additive Logistic Regression: A Statistical View of Boosting, 1998.

[19] Dan Steinberg, et al. Stochastic Gradient Boosting: An Introduction to TreeNet™, 2002, AusDM.

[20] Vladimir N. Vapnik. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[21] Vladimir Vapnik. Statistical Learning Theory, 1998.