Bagging with adaptive costs

Ensemble methods have proved highly effective at improving the performance of base learners in most circumstances. In this paper, we propose a new algorithm that combines the merits of several existing techniques, namely bagging, arcing, and stacking. The basic structure of the algorithm resembles bagging, with a linear support vector machine (SVM) as the base learner. However, the misclassification cost of each training point is repeatedly adjusted according to its observed out-of-bag vote margin. In this way, the method gains the advantage of arcing (building the classifier the ensemble needs) without fixating on potentially noisy points. Computational experiments show that this algorithm performs consistently better than bagging and arcing.
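
To make the cost-adaptation loop concrete, the following is a minimal Python sketch of the procedure described above, using scikit-learn's LinearSVC as the linear SVM base learner. The function name adaptive_cost_bagging, the exponential update rule exp(-eta * margin), the rate parameter eta, and the cost renormalization are illustrative assumptions, not the paper's exact formulation; only the overall structure, bootstrap rounds with out-of-bag vote margins driving per-point misclassification costs, follows the text.

```python
import numpy as np
from sklearn.svm import LinearSVC

def adaptive_cost_bagging(X, y, n_rounds=25, eta=0.5, rng=None):
    """Sketch of bagging with per-point costs adapted from out-of-bag margins.

    Assumptions (not from the paper): labels y are in {-1, +1}, costs are
    updated by an exponential rule in the margin, and eta sets the rate.
    """
    rng = np.random.default_rng(rng)
    n = len(X)
    costs = np.ones(n)        # per-point misclassification costs
    votes = np.zeros(n)       # running sum of out-of-bag votes per point
    oob_counts = np.zeros(n)  # how often each point has been out-of-bag
    ensemble = []

    for _ in range(n_rounds):
        idx = rng.integers(0, n, size=n)       # bootstrap sample
        oob = np.setdiff1d(np.arange(n), idx)  # out-of-bag points
        clf = LinearSVC()
        # Costs enter the linear SVM through per-sample weights.
        clf.fit(X[idx], y[idx], sample_weight=costs[idx])
        ensemble.append(clf)

        # Record out-of-bag votes: +1 for a correct vote, -1 otherwise.
        pred = clf.predict(X[oob])
        votes[oob] += np.where(pred == y[oob], 1.0, -1.0)
        oob_counts[oob] += 1

        # Raise the cost of points with low (negative) out-of-bag margin,
        # lower it for points the ensemble already classifies confidently.
        margin = np.divide(votes, oob_counts,
                           out=np.zeros(n), where=oob_counts > 0)
        costs = np.exp(-eta * margin)
        costs *= n / costs.sum()  # keep total cost mass constant

    def predict(X_new):
        # Majority vote over the ensemble; predictions are in {-1, +1}.
        vote_sum = sum(clf.predict(X_new) for clf in ensemble)
        return np.sign(vote_sum)

    return predict
```

Because the costs are driven only by out-of-bag votes, each point's cost reflects classifiers that never trained on it; this is what lets the method concentrate effort where the ensemble is weak without fixating on noisy points the way arcing can.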
