Paper Information - Selection of Decision Stumps in Bagging Ensembles

Selection of Decision Stumps in Bagging Ensembles

This article presents a comprehensive study of different ensemble pruning techniques applied to a bagging ensemble composed of decision stumps. Six different ensemble pruning methods are tested. Four of these are greedy strategies based on first reordering the elements of the ensemble according to some rule that takes into account the complementarity of the predictors with respect to the classification task. Subensembles of increasing size are then constructed by incorporating the ordered classifiers one by one. A halting criterion stops the aggregation process before the complete original ensemble is recovered. The other two approaches are selection techniques that attempt to identify optimal subensembles using either genetic algorithms or semidefinite programming. Experiments performed on 24 benchmark classification tasks show that the selection of a small subset (≈ 10-15%) of the original pool of stumps generated with bagging can significantly increase the accuracy and reduce the complexity of the ensemble.
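The ordering-then-halting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the ordering rules or the halting criterion, so everything here is an assumption made for illustration: the ordering rule is a simple greedy error-reduction heuristic on a selection set, class labels are encoded as +1/-1, the halting criterion is a fixed fraction of the ensemble, and the function names (`majority_vote_error`, `greedy_order`, `prune`) and the `fraction` parameter are hypothetical.

```python
from typing import List, Sequence


def majority_vote_error(votes: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of examples misclassified by the sign of the summed votes.

    Ties (a vote sum of zero) are counted as errors.
    """
    wrong = sum(1 for v, y in zip(votes, labels) if v * y <= 0)
    return wrong / len(labels)


def greedy_order(predictions: List[List[int]], labels: List[int]) -> List[int]:
    """Order ensemble members by greedy error reduction (assumed rule).

    predictions[t][i] is the +1/-1 output of classifier t on example i.
    At each step, the classifier that minimizes the majority-vote error of
    the growing subensemble is appended to the ordering.
    """
    remaining = list(range(len(predictions)))
    votes = [0] * len(labels)  # running sum of votes per example
    order: List[int] = []
    while remaining:
        best = min(
            remaining,
            key=lambda t: majority_vote_error(
                [v + p for v, p in zip(votes, predictions[t])], labels
            ),
        )
        votes = [v + p for v, p in zip(votes, predictions[best])]
        order.append(best)
        remaining.remove(best)
    return order


def prune(order: List[int], fraction: float = 0.2) -> List[int]:
    """Halt aggregation early: keep the first `fraction` of the ordering.

    The fixed-fraction rule and its default value are assumptions.
    """
    keep = max(1, round(fraction * len(order)))
    return order[:keep]
```

In use, `predictions` would hold the outputs of the bagged stumps on a selection set; `prune(greedy_order(predictions, labels))` then returns the indices of the stumps retained in the subensemble.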

Daniel Hernández-Lobato | Gonzalo Martínez-Muñoz | Alberto Suárez
