On a Unified Framework for Sampling With and Without Replacement in Decision Tree Ensembles

Classifier ensembles are an active area of research within the machine learning community. One of the most successful techniques is bagging, where an algorithm (typically a decision tree inducer) is applied to several different training sets, each obtained by sampling with replacement from the original database. In this paper we define a framework in which sampling with and without replacement can be viewed as the extreme cases of a more general process, and we analyze the performance of bagging when it is extended to this framework.
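The abstract does not spell out the general process, so the following is only an illustrative sketch of one way the two sampling schemes could sit at the ends of a single parameter. It assumes a hypothetical cap, `max_copies`, on how many times an instance may appear in one training set: a cap of 1 gives sampling without replacement, and a cap at least as large as the sample size gives ordinary sampling with replacement. The function names and the cap itself are assumptions for illustration, not the paper's definition.

```python
import random
from collections import Counter


def capped_sample(n_items, sample_size, max_copies, rng):
    """Draw sample_size indices from range(n_items), each at most max_copies times.

    Hypothetical interpolation (an assumption, not the paper's framework):
      max_copies == 1            -> sampling without replacement
      max_copies >= sample_size  -> ordinary sampling with replacement
    """
    if max_copies < 1:
        raise ValueError("max_copies must be at least 1")
    if sample_size > n_items * max_copies:
        raise ValueError("sample_size exceeds n_items * max_copies")

    counts = Counter()
    available = list(range(n_items))  # indices that may still be drawn
    sample = []
    while len(sample) < sample_size:
        pos = rng.randrange(len(available))
        idx = available[pos]
        sample.append(idx)
        counts[idx] += 1
        if counts[idx] == max_copies:
            # The index has reached its cap: remove it in O(1) by
            # overwriting it with the last element and popping.
            available[pos] = available[-1]
            available.pop()
    return sample


def bagging_samples(n_items, n_sets, sample_size, max_copies, seed=0):
    """Build the index sets for n_sets training sets, as bagging would need."""
    rng = random.Random(seed)
    return [capped_sample(n_items, sample_size, max_copies, rng)
            for _ in range(n_sets)]


if __name__ == "__main__":
    # Ten training sets of 8 instances from a 10-instance database,
    # with no instance repeated more than twice in any set.
    for indices in bagging_samples(10, n_sets=10, sample_size=8, max_copies=2):
        print(sorted(indices))
```

Each index list would then be used to select the rows on which one base classifier, such as a decision tree, is trained. Intermediate values of `max_copies` give training sets that lie between the two extremes.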
