Improved Boosting algorithm with adaptive filtration

AdaBoost is known to be an effective method for improving the performance of base classifiers, both theoretically and empirically. However, previous studies have shown that AdaBoost is prone to overfitting, especially on noisy data. In addition, most existing work on Boosting assumes a fixed loss function and therefore does not distinguish between the noisy and noise-free cases. In this paper, an improved Boosting algorithm with adaptive filtration is proposed. First, a filtering algorithm based on the Hoeffding inequality is designed to identify mislabeled or atypical samples. By incorporating this filter, the loss function is modified so that the influence of mislabeled or atypical samples is penalized. Experiments on eight UCI data sets show that the new Boosting algorithm almost always achieves considerably better classification accuracy than AdaBoost. Furthermore, experiments on data with artificially controlled noise indicate that the new Boosting algorithm is more robust to noise than AdaBoost.
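The abstract does not give the exact filtering rule, but a natural reading is that each sample's empirical misclassification rate over the base learners trained so far is compared against a Hoeffding confidence margin, and samples that exceed it are treated as likely mislabeled or atypical and penalized in the weight update. The following Python sketch (using scikit-learn decision stumps) illustrates this idea under stated assumptions; the threshold 1/2 plus the Hoeffding margin, the confidence level delta, and the weight-reset penalty are illustrative choices, not the paper's exact formulation.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def hoeffding_margin(t, delta=0.05):
        # Hoeffding inequality: with probability >= 1 - delta, the empirical
        # mean of t bounded {0,1} outcomes deviates from its expectation by
        # at most eps = sqrt(ln(2/delta) / (2 t)).
        return np.sqrt(np.log(2.0 / delta) / (2.0 * t))

    def filtered_adaboost(X, y, n_rounds=50, delta=0.05):
        # AdaBoost-style booster with a Hoeffding-based filter (illustrative
        # sketch, not the authors' algorithm). Labels y are assumed in {-1, +1}.
        n = len(y)
        w = np.full(n, 1.0 / n)      # sample weights
        miss_counts = np.zeros(n)    # how often each sample has been misclassified
        learners, alphas = [], []

        for t in range(1, n_rounds + 1):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)
            pred = stump.predict(X)
            miss = (pred != y).astype(float)

            err = float(np.dot(w, miss))
            if err == 0.0 or err >= 0.5:
                break
            alpha = 0.5 * np.log((1.0 - err) / err)
            learners.append(stump)
            alphas.append(alpha)

            # Flag samples whose empirical misclassification rate exceeds 1/2
            # by more than the Hoeffding margin: likely mislabeled or atypical.
            miss_counts += miss
            noisy = (miss_counts / t) > 0.5 + hoeffding_margin(t, delta)

            # Standard AdaBoost reweighting, but penalize flagged samples by
            # resetting their weights so they cannot dominate later rounds
            # (an assumed, illustrative penalty).
            w *= np.exp(alpha * miss)
            w[noisy] = 1.0 / n
            w /= w.sum()

        def predict(X_new):
            scores = sum(a * clf.predict(X_new) for a, clf in zip(alphas, learners))
            return np.sign(scores)

        return predict

The key difference from plain AdaBoost is that a flagged sample's weight is not allowed to grow without bound, which is one simple way to realize the "penalized influence" described in the abstract.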
