Unifying the error-correcting and output-code AdaBoost within the margin framework

In this paper, we present a new interpretation of AdaBoost.ECC and AdaBoost.OC. We show that AdaBoost.ECC performs stage-wise functional gradient descent on a cost function defined over margin values, and that AdaBoost.OC is a shrinkage version of AdaBoost.ECC. These findings rigorously explain several properties of the two algorithms. The gradient-minimization formulation of AdaBoost.ECC allows us to derive a new algorithm, referred to as AdaBoost.SECC, by explicitly exploiting shrinkage as regularization in AdaBoost.ECC. Experiments on diverse datasets confirm our theoretical findings. Empirical results show that AdaBoost.SECC performs significantly better than AdaBoost.ECC and AdaBoost.OC.
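
To illustrate the idea of shrinkage acting as regularization inside a stage-wise functional gradient descent loop on a margin-based cost, the sketch below implements a generic binary boosting procedure with an exponential margin cost and a shrinkage factor applied to each step. This is a minimal illustration under stated assumptions, not the paper's multiclass AdaBoost.ECC/AdaBoost.SECC procedure; the function name boost_with_shrinkage, the shrinkage parameter, and the choice of decision stumps as base learners are all hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_with_shrinkage(X, y, n_rounds=100, shrinkage=0.1):
    """Stage-wise functional gradient descent on the exponential margin
    cost L(F) = sum_i exp(-y_i F(x_i)), with a shrinkage factor scaling
    every base-learner step (shrinkage=1.0 recovers the unshrunk update).

    Illustrative binary-classification sketch only; labels y must be in
    {-1, +1}. Names and parameters here are hypothetical, not the
    paper's AdaBoost.ECC / AdaBoost.SECC algorithm.
    """
    n = X.shape[0]
    F = np.zeros(n)          # current ensemble scores F(x_i)
    ensemble = []            # list of (coefficient, base learner) pairs

    for _ in range(n_rounds):
        # Sample weights w_i = exp(-y_i F(x_i)) come from the functional
        # gradient of the margin cost at the current ensemble; fitting a
        # weighted weak learner approximates a descent direction.
        w = np.exp(-y * F)
        w /= w.sum()

        # Fit a weak learner (decision stump) on the weighted sample.
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        h = stump.predict(X).astype(float)

        # AdaBoost-style closed-form step size from the weighted error.
        err = np.clip(np.sum(w * (h != y)), 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)

        # Shrinkage: take only a fraction of the descent step, which
        # acts as regularization on the margin cost.
        coef = shrinkage * alpha
        F += coef * h
        ensemble.append((coef, stump))

    return ensemble
```

Setting shrinkage=1.0 yields the unregularized stage-wise update, while smaller values take a damped step each round; this mirrors the relationship the paper establishes between AdaBoost.ECC and its shrinkage-regularized counterpart AdaBoost.SECC.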
