Experiments with a New Boosting Algorithm

In an earlier paper, we introduced a new "boosting" algorithm called AdaBoost which, theoretically, can be used to significantly reduce the error of any learning algorithm that consistently generates classifiers whose performance is a little better than random guessing. We also introduced the related notion of a "pseudo-loss," a method for forcing a learning algorithm for multi-label concepts to concentrate on the labels that are hardest to discriminate. In this paper, we describe experiments we carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems. We performed two sets of experiments. The first set compared boosting to Breiman's "bagging" method when used to aggregate various classifiers (including decision trees and single attribute-value tests). We compared the performance of the two methods on a collection of machine-learning benchmarks. In the second set of experiments, we studied in more detail the performance of boosting using a nearest-neighbor classifier on an OCR problem.
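For readers unfamiliar with boosting, the sketch below illustrates the standard binary, error-based AdaBoost reweighting scheme described informally above. It is a minimal illustration, not the paper's own pseudocode or its multi-label pseudo-loss variant; the routine `weak_learn(X, y, w)` is a hypothetical placeholder for any weak learner that accepts example weights, and labels are assumed to lie in {-1, +1}.

```python
import numpy as np

def adaboost(X, y, weak_learn, n_rounds=50):
    """Minimal binary AdaBoost sketch (assumes labels in {-1, +1})."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # distribution D_1 over training examples
    hypotheses, alphas = [], []
    for _ in range(n_rounds):
        h = weak_learn(X, y, w)          # hypothetical weak learner trained on weighted data
        pred = h.predict(X)
        err = np.sum(w[pred != y])       # weighted training error of this round's hypothesis
        if err <= 0 or err >= 0.5:
            # perfect hypothesis, or no better than random guessing: stop early
            if err <= 0:
                hypotheses, alphas = [h], [1.0]
            break
        beta = err / (1.0 - err)
        alpha = np.log(1.0 / beta)       # vote weight for this round
        # shrink the weight of correctly classified examples, then renormalize
        w = np.where(pred == y, w * beta, w)
        w /= w.sum()
        hypotheses.append(h)
        alphas.append(alpha)

    def predict(X_new):
        # final hypothesis: weighted majority vote of the weak hypotheses
        votes = sum(a * h.predict(X_new) for a, h in zip(alphas, hypotheses))
        return np.sign(votes)
    return predict
```

The key property is that examples misclassified in one round keep relatively more weight in the next, so later weak hypotheses are forced to focus on the "hard" examples, which is the intuition behind both the error-based and pseudo-loss formulations studied in the paper.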
