Pruning in Ordered Regression Bagging Ensembles

An efficient procedure for pruning regression ensembles is introduced. Starting from a bagging ensemble, pruning proceeds by ordering the regressors in the original ensemble and then selecting a subset of them for aggregation. Subensembles of increasing size are built by incorporating first the regressors that perform best when aggregated. This strategy provides an approximate solution to the problem of extracting from the original ensemble the minimum-error subensemble, a problem we prove to be NP-hard. Experiments show that pruned ensembles containing only 20% of the initial regressors achieve better generalization performance than the complete bagging ensembles. The performance of the pruned ensembles is analyzed by means of the bias-variance decomposition of the error.
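
Below is a minimal sketch of this ordering strategy in Python, using scikit-learn's BaggingRegressor. The greedy criterion shown (at each step, adding the remaining regressor that most reduces the training-set error of the averaged subensemble) is one plausible instantiation of "incorporating first the regressors that perform best when aggregated", not necessarily the exact criterion of the paper; the function names and parameters are illustrative.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def order_regressors(preds, y):
    """Greedily order regressors: at each step, add the one whose inclusion
    minimizes the mean squared error of the averaged subensemble."""
    n_models = preds.shape[0]
    remaining = list(range(n_models))
    order = []
    running_sum = np.zeros(preds.shape[1])
    for size in range(1, n_models + 1):
        errors = [np.mean(((running_sum + preds[i]) / size - y) ** 2)
                  for i in remaining]
        best = remaining[int(np.argmin(errors))]
        order.append(best)
        remaining.remove(best)
        running_sum += preds[best]
    return order


X, y = make_friedman1(n_samples=600, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# A standard bagging ensemble of 100 regression trees.
bag = BaggingRegressor(DecisionTreeRegressor(), n_estimators=100,
                       random_state=0).fit(X_tr, y_tr)

# Order the regressors on the training set, then keep the first 20%.
train_preds = np.array([m.predict(X_tr) for m in bag.estimators_])
order = order_regressors(train_preds, y_tr)
pruned = order[: len(order) // 5]

# Aggregate the pruned subensemble by simple averaging; compare test MSE.
test_preds = np.array([bag.estimators_[i].predict(X_te) for i in pruned])
mse_pruned = np.mean((test_preds.mean(axis=0) - y_te) ** 2)
mse_full = np.mean((bag.predict(X_te) - y_te) ** 2)
print(f"full ensemble MSE: {mse_full:.3f}, pruned (20%) MSE: {mse_pruned:.3f}")
```

The fraction retained here (20%) mirrors the figure quoted in the abstract; in practice the subensemble size would be chosen by monitoring the error curve of the ordered ensemble rather than fixed in advance.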
