Cost-sensitive Multiclass Classification Risk Bounds

A commonly used approach to multiclass classification is to replace the 0-1 loss with a convex surrogate so as to make empirical risk minimization computationally tractable. Previous work has uncovered necessary and sufficient conditions for the consistency of the resulting procedures. In this paper, we strengthen these results by showing how the 0-1 excess loss of a predictor can be upper bounded as a function of the excess loss of the predictor measured using the convex surrogate. The bounds are developed for the case of cost-sensitive multiclass classification and a convex surrogate loss that goes back to the work of Lee, Lin and Wahba. They are as easy to calculate as in binary classification. Furthermore, we show that our analysis extends to the recently introduced "Simplex Coding" scheme.
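To make the kind of surrogate discussed above concrete, the following is a minimal sketch of a cost-sensitive loss in the style of Lee, Lin and Wahba. It rests on several assumptions not stated in the abstract: scores are constrained to sum to zero, the convex function is taken to be the hinge phi(s) = max(0, 1 + s), and the loss weights phi of each class score by the cost of predicting that class. The function names are hypothetical, and the exact form analyzed in the paper may differ.

```python
def hinge(s):
    # Assumed convex function phi(s) = max(0, 1 + s); other convex choices are possible.
    return max(0.0, 1.0 + s)


def center(scores):
    # Enforce the sum-to-zero constraint on class scores by subtracting their mean.
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


def surrogate_loss(scores, costs):
    # costs[k] is the cost of predicting class k given the true label.
    # With 0-1 costs (zero only on the true class) this sums phi over the wrong classes.
    return sum(c * hinge(s) for c, s in zip(costs, scores))


def predict(scores):
    # Predict the class with the largest score.
    return max(range(len(scores)), key=lambda k: scores[k])


# Three classes, true class 0, 0-1 costs.
scores = center([2.0, -0.5, -1.5])
costs = [0.0, 1.0, 1.0]
loss = surrogate_loss(scores, costs)
label = predict(scores)
```

In this sketch the loss is convex in the scores because it is a nonnegative combination of convex functions, which is what makes empirical risk minimization tractable; the 0-1 cost of the prediction itself is not convex.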
