Using all data to generate decision tree ensembles

This paper develops a new method for generating ensembles of classifiers in which every individual classifier is built from all of the available data. The base algorithm constructs a decision tree iteratively: the training data are divided into two subsets; in each iteration, one subset is used to grow the decision tree, starting from the tree produced in the previous iteration, and the fully grown tree is then pruned using the other subset. The roles of the two subsets are interchanged in every iteration. This process converges to a final tree that is stable under the combined growing and pruning steps. To generate a diverse collection of classifiers for the ensemble, the subsets required by the iterative tree-construction algorithm are drawn at random. The method exhibits good performance on several standard datasets at low computational cost.
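A minimal sketch of the grow-and-prune ensemble idea is given below, assuming scikit-learn and NumPy. Because scikit-learn trees cannot be regrown from an earlier tree, the sketch only approximates the paper's iterative step: in each round it refits an unpruned tree on one random half of the training data and prunes it by selecting the cost-complexity alpha that scores best on the other half, then swaps the roles of the halves. The function names (grow_prune_tree, build_ensemble, predict_majority), the fixed number of rounds, and the breast-cancer example data are illustrative choices, not details taken from the paper.

```python
# Hedged sketch of a grow/prune ensemble over random half-splits of the FULL
# training set. Each "grow" step refits from scratch and each "prune" step
# uses cost-complexity pruning tuned on the held-out half; this approximates,
# rather than reproduces, the paper's iterative growing-and-pruning algorithm.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def grow_prune_tree(X, y, rng, n_rounds=4):
    """Alternate grow/prune steps over a random half-split of (X, y)."""
    idx = rng.permutation(len(y))
    half = len(y) // 2
    parts = [idx[:half], idx[half:]]
    tree = None
    for r in range(n_rounds):  # fixed round count stands in for convergence
        grow, prune = parts[r % 2], parts[(r + 1) % 2]
        # Grow an unpruned tree on one half.
        full = DecisionTreeClassifier(random_state=0).fit(X[grow], y[grow])
        # Prune: pick the cost-complexity alpha that scores best on the other half.
        alphas = full.cost_complexity_pruning_path(X[grow], y[grow]).ccp_alphas
        best_score, best_tree = -np.inf, full
        for a in alphas:
            cand = DecisionTreeClassifier(ccp_alpha=a, random_state=0)
            cand.fit(X[grow], y[grow])
            score = cand.score(X[prune], y[prune])
            if score > best_score:
                best_score, best_tree = score, cand
        tree = best_tree
    return tree


def build_ensemble(X, y, n_trees=25, seed=0):
    """Every member sees all the data; diversity comes from the random splits."""
    rng = np.random.default_rng(seed)
    return [grow_prune_tree(X, y, rng) for _ in range(n_trees)]


def predict_majority(trees, X):
    """Majority vote over the ensemble (assumes integer class labels)."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)
    return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)


if __name__ == "__main__":
    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
    trees = build_ensemble(X_tr, y_tr)
    acc = np.mean(predict_majority(trees, X_te) == y_te)
    print(f"ensemble test accuracy: {acc:.3f}")
```

In this sketch every ensemble member is trained on the full training set, since both halves participate in growing and pruning; variability among members comes only from the random half-splits, which mirrors the motivation stated in the abstract.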
