ABC-boost: adaptive base class boost for multi-class classification

We propose <b><i>abc-boost</i></b> (adaptive base class boost) for multi-class classification and present <b><i>abc-mart</i></b>, an implementation of <i>abc-boost</i>, based on the multinomial logit model. The key idea is that, at each boosting iteration, we <b><i>adaptively</i></b> and greedily choose a <b><i>base</i></b> class. Our experiments on public datasets demonstrate the improvement of <i>abc-mart</i> over the original <i>mart</i> algorithm.
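The greedy choice of a base class can be illustrated with a small sketch. This is not the authors' implementation: it assumes the multinomial logit loss with a sum-to-zero constraint on the class scores, so that the base class score is determined by the others, and it replaces the regression-tree fitting of mart with a plain per-example gradient step. The function names, the `step` parameter, and the toy labels are all hypothetical.

```python
import numpy as np

def softmax(F):
    """Row-wise class probabilities p_{i,k} = exp(F_{i,k}) / sum_s exp(F_{i,s})."""
    Z = F - F.max(axis=1, keepdims=True)  # shift for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def neg_log_likelihood(F, R):
    """Multinomial logit loss; R is the one-hot label matrix."""
    return -np.sum(R * np.log(softmax(F)))

def constrained_gradient(F, R, base):
    """dL/dF_k for every k != base, with F_base = -sum_{k != base} F_k.

    By the chain rule, each non-base derivative picks up an extra
    term from the base class: (p_k - r_k) - (p_base - r_base).
    """
    P = softmax(F)
    G = (P - R) - (P[:, [base]] - R[:, [base]])
    return np.delete(G, base, axis=1)  # drop the base-class column

def greedy_base_class_step(F, R, step=0.5):
    """Try each class as the base, take one gradient step on the other
    K-1 scores, and keep the candidate with the lowest training loss.

    Assumption: a direct gradient step stands in for the weak learner
    (regression trees in mart) that a real implementation would fit.
    """
    K = F.shape[1]
    best = None
    for b in range(K):
        others = [k for k in range(K) if k != b]
        F_new = F.copy()
        F_new[:, others] -= step * constrained_gradient(F, R, b)
        F_new[:, b] = -F_new[:, others].sum(axis=1)  # restore sum-to-zero
        loss = neg_log_likelihood(F_new, R)
        if best is None or loss < best[1]:
            best = (b, loss, F_new)
    return best  # (chosen base class, training loss, updated scores)

# Toy usage: five examples, three classes, scores initialised to zero.
labels = np.array([0, 0, 0, 1, 2])
R = np.eye(3)[labels]
F = np.zeros((5, 3))
base, loss, F = greedy_base_class_step(F, R)
```

Calling `greedy_base_class_step` repeatedly, feeding the returned scores back in, gives one base-class decision per boosting iteration, which is the adaptive part of the idea. The exhaustive loop over all classes is the simplest possible search strategy and costs one trial update per class per iteration.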
