A multi-class boosting method with direct optimization

We present a direct multi-class boosting (DMCBoost) method for classification with the following properties: (i) instead of reducing the multi-class classification task to a set of binary classification tasks, DMCBoost solves the multi-class problem directly and requires only very weak base classifiers; (ii) DMCBoost builds an ensemble classifier by directly optimizing non-convex performance measures, including the empirical classification error and margin functions, without resorting to upper bounds or convex approximations. As a non-convex optimization method, DMCBoost achieves results competitive with or better than state-of-the-art convex-relaxation boosting methods, and it performs especially well in noisy settings.
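To illustrate the core idea of direct optimization, the following is a hypothetical minimal sketch, not the authors' actual DMCBoost algorithm (which also optimizes margin functions): each boosting round greedily adds the weak multi-class stump and weight that most reduce the empirical 0-1 classification error itself, instead of a convex surrogate such as the exponential loss. All function names and the coarse line search over a fixed weight grid are illustrative assumptions.

```python
import numpy as np

def stump_candidates(X, n_classes):
    """Enumerate very weak base classifiers: single-feature threshold
    stumps that predict one class on each side of the split."""
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for lo in range(n_classes):
                for hi in range(n_classes):
                    if lo != hi:
                        yield (j, t, lo, hi)

def stump_predict(stump, X, n_classes):
    """One-hot class votes of a stump on all samples."""
    j, t, lo, hi = stump
    pred = np.where(X[:, j] <= t, lo, hi)
    return np.eye(n_classes)[pred]

def direct_01_boost(X, y, n_classes, n_rounds=5, alphas=(0.5, 1.0, 2.0)):
    """Greedy ensemble construction that scores each candidate addition
    by the resulting empirical 0-1 error (a non-convex objective),
    with no convex upper bound or smooth approximation."""
    scores = np.zeros((len(y), n_classes))   # accumulated class scores
    ensemble = []
    cur_err = np.mean(np.argmax(scores, axis=1) != y)
    for _ in range(n_rounds):
        best = None
        for stump in stump_candidates(X, n_classes):
            votes = stump_predict(stump, X, n_classes)
            for a in alphas:                 # coarse search over the weight
                err = np.mean(np.argmax(scores + a * votes, axis=1) != y)
                if best is None or err < best[0]:
                    best = (err, stump, a)
        if best[0] >= cur_err:               # no candidate lowers the 0-1 loss
            break
        cur_err, stump, a = best
        scores += a * stump_predict(stump, X, n_classes)
        ensemble.append((stump, a))
    return ensemble, cur_err

# Toy usage: four 1-D points from three classes.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 2])
ensemble, train_err = direct_01_boost(X, y, n_classes=3)
```

Because the 0-1 objective is piecewise constant, the inner search is combinatorial rather than gradient-based; the sketch makes that cost explicit by enumerating every stump and weight each round, which is only feasible for very small candidate pools.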
