Efficient Margin Maximizing with Boosting

AdaBoost produces a linear combination of base hypotheses and predicts with the sign of this linear combination. The linear combination may be viewed as a hyperplane in feature space, where the base hypotheses form the features. It has been observed that the generalization error of the algorithm continues to improve even after all examples are on the correct side of the current hyperplane. This improvement is attributed to the experimental observation that the distances (margins) of the examples to the separating hyperplane keep increasing even after all examples are on the correct side. We introduce a new version of AdaBoost, called AdaBoost*ν, that explicitly maximizes the minimum margin of the examples up to a given precision. The algorithm incorporates a current estimate of the achievable margin into its calculation of the linear coefficients of the base hypotheses. The bound on the number of iterations needed by the new algorithm is the same as the number needed by a known version of AdaBoost that must receive an explicit estimate of the achievable margin as a parameter. We also illustrate experimentally that our algorithm requires considerably fewer iterations than other algorithms that aim to maximize the margin.
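
The key change relative to plain AdaBoost is the coefficient update: each round's coefficient combines the edge of the current base hypothesis with a running estimate of the achievable margin (the smallest edge seen so far, reduced by the precision parameter ν). The following is a minimal Python sketch of that idea; the function names and the base-learner interface are illustrative assumptions, not the authors' code, and it omits the stopping criterion and numerical safeguards of the full algorithm.

```python
import numpy as np

def adaboost_star_nu(X, y, base_learner, nu=0.1, n_rounds=100):
    """Minimal sketch of the AdaBoost*_nu update (assumed interface).

    X: (n, d) feature matrix; y: labels in {-1, +1}.
    base_learner(X, y, d) is assumed to return a hypothesis h
    with h(X) in {-1, +1}^n, trained on distribution d.
    Assumes no base hypothesis is perfect (|edge| < 1), so the
    logarithms below stay finite.
    """
    n = len(y)
    d = np.full(n, 1.0 / n)          # distribution over training examples
    hypotheses, alphas = [], []
    min_edge = 1.0                   # running minimum of the observed edges

    for _ in range(n_rounds):
        h = base_learner(X, y, d)
        pred = h(X)
        gamma = np.dot(d, y * pred)  # edge of h under the current distribution

        # Current estimate of the achievable margin: smallest edge
        # observed so far, minus the precision parameter nu.
        min_edge = min(min_edge, gamma)
        rho_hat = min_edge - nu

        # Coefficient incorporates the margin estimate; plain AdaBoost
        # would use only the first log term.
        alpha = 0.5 * (np.log((1 + gamma) / (1 - gamma))
                       - np.log((1 + rho_hat) / (1 - rho_hat)))

        hypotheses.append(h)
        alphas.append(alpha)

        # Reweight examples: smaller margin -> larger weight.
        d *= np.exp(-alpha * y * pred)
        d /= d.sum()

    def combined(Xq):
        # Predict with the sign of the linear combination.
        scores = sum(a * h(Xq) for a, h in zip(alphas, hypotheses))
        return np.sign(scores)

    return combined
```

In this sketch, examples whose margin falls below the estimate ρ̂ receive extra weight, which is what drives the minimum margin toward the optimum up to the precision ν.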
