Bagging Decision Multi-trees

Ensemble methods improve accuracy by combining the predictions of a set of different hypotheses. A well-known method for generating hypothesis ensembles is Bagging. One of the main drawbacks of ensemble methods in general, and Bagging in particular, is the huge amount of computational resources required to learn, store, and apply the set of models. Another problem is that, even with the bootstrap technique, many of the simple models are similar, which limits ensemble diversity. In this work, we investigate an optimization technique based on sharing the common parts of the models in an ensemble of decision trees in order to mitigate both problems. Concretely, we employ a structure called a decision multi-tree, which can simultaneously contain a set of decision trees and hence represent the "repeated" parts just once. A thorough experimental evaluation shows that the proposed optimization technique pays off in practice.
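To make the idea concrete, the following is a minimal, illustrative sketch of a decision multi-tree: a node may hold several alternative splits, so that trees in a bagged ensemble share their common upper parts instead of duplicating them per bootstrap replica. The class name, node fields, binary numeric splits, and majority-vote combination are all assumptions for illustration; the actual structure used in the paper (as implemented in the SMILES system) may differ.

```python
# Illustrative sketch only; not the paper's actual implementation.
from dataclasses import dataclass, field

@dataclass
class MultiNode:
    prediction: object = None  # class label used when this node acts as a leaf
    # Each entry is one alternative split: (feature_index, threshold, left, right).
    # Holding several alternatives in one node is what lets many trees share
    # their common parts, which are then stored and traversed only once.
    splits: list = field(default_factory=list)

    def predict(self, x):
        if not self.splits:  # leaf: no alternative splits below this node
            return self.prediction
        votes = []
        for feature, threshold, left, right in self.splits:
            child = left if x[feature] <= threshold else right
            votes.append(child.predict(x))
        # Combine the embedded trees' predictions by majority vote.
        return max(set(votes), key=votes.count)

# Two embedded trees diverge only at the root's alternative splits;
# any shared subtrees below would be represented by a single MultiNode.
leaf_yes = MultiNode(prediction="yes")
leaf_no = MultiNode(prediction="no")
root = MultiNode(splits=[(0, 0.5, leaf_yes, leaf_no),   # first tree's split
                         (1, 2.0, leaf_no, leaf_yes)])  # alternative split
print(root.predict([0.3, 3.0]))  # -> "yes"
```

In this reading, a multi-tree generalizes an option tree: each alternative split corresponds to one member of the ensemble, and memory and evaluation cost are saved wherever the members agree on a test.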
