Three AdaBoost variants are distinguished by the strategy used to update the object weights for each new ensemble member. The classic AdaBoost due to Freund and Schapire decreases only the weights of the correctly classified objects and is conservative in this sense; all the weights are then rescaled in a normalization step. Other AdaBoost variants in the literature update all the weights before renormalizing (the aggressive variant). Alternatively, we may increase only the weights of the misclassified objects and then renormalize (the second conservative variant). The three variants have different bounds on their training errors, which could indicate different generalization performance. The bounds are derived here following the proof by Freund and Schapire for the classical AdaBoost for multiple classes (AdaBoost.M1) and are compared against each other. The aggressive variant and the second, less popular, conservative variant have lower error bounds than the classical AdaBoost. Also, whereas the coefficients βi in the classical AdaBoost are found as the unique solution of a minimization problem on the bound, the bounds for the aggressive and the second conservative variants are monotone increasing functions of βi (0 ≤ βi ≤ 1), giving infinitely many choices of βi.
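The three update strategies described above can be illustrated with a minimal Python sketch. The abstract does not give the multiplicative factors, so the ones used here are assumptions: a correctly classified object is scaled by β and a misclassified one by 1/β, wherever the variant touches that group. The function name `update_weights`, the variant labels, and the restriction to β > 0 (needed to divide by β) are likewise hypothetical choices made for illustration only.

```python
def update_weights(weights, misclassified, beta, variant="classic"):
    """Apply one boosting weight update and renormalize.

    weights       -- current object weights (non-negative, summing to 1)
    misclassified -- booleans, True where the new ensemble member erred
    beta          -- coefficient with 0 < beta <= 1 (assumed; beta = 0 is excluded
                     here because two variants divide by beta)
    variant       -- "classic", "aggressive", or "conservative2" (labels assumed)
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")

    updated = []
    for w, wrong in zip(weights, misclassified):
        if variant == "classic":
            # Only correctly classified objects are down-weighted.
            factor = 1.0 if wrong else beta
        elif variant == "aggressive":
            # Both groups change: errors go up, correct objects go down.
            factor = 1.0 / beta if wrong else beta
        elif variant == "conservative2":
            # Only misclassified objects are up-weighted.
            factor = 1.0 / beta if wrong else 1.0
        else:
            raise ValueError("unknown variant: " + variant)
        updated.append(w * factor)

    # Normalization step shared by all three variants.
    total = sum(updated)
    return [u / total for u in updated]


if __name__ == "__main__":
    w0 = [0.25, 0.25, 0.25, 0.25]
    errs = [True, False, False, False]
    for name in ("classic", "aggressive", "conservative2"):
        print(name, update_weights(w0, errs, beta=0.5, variant=name))
```

In all three cases the misclassified object ends up with a larger share of the total weight after normalization; the variants differ in which group of weights is rescaled before that step.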
[1] David G. Stork, et al. Pattern Classification, 1973.
[2] Eric Bauer, et al. An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants, 1999, Machine Learning.
[3] Shigeo Abe. Pattern Classification, 2001, Springer London.
[4] Robert E. Schapire, et al. The Boosting Approach to Machine Learning: An Overview, 2003.
[5] Robert E. Schapire, et al. Theoretical Views of Boosting, 1999, EuroCOLT.
[6] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.