Consistent algorithms for multiclass classification with an abstain option

We consider the problem of n-class classification (n ≥ 2), where the classifier can choose to abstain from making a prediction at a given cost, say, a factor α of the cost of misclassification. Our goal is to design consistent algorithms for such n-class classification problems with a ‘reject option’; while such algorithms are known for the binary (n = 2) case, little is understood for the general multiclass case. We show that the well-known Crammer-Singer surrogate and the one-vs-all hinge loss, albeit with a different predictor than the standard argmax, yield consistent algorithms for this problem when α = 1/2. More interestingly, we design a new convex surrogate, which we call the binary encoded predictions surrogate, that is also consistent for this problem when α = 1/2 and operates on a much lower-dimensional space (log(n) as opposed to n). We also construct modified versions of all three of these surrogates that are consistent for any given α ∈ [0, 1/2].

MSC 2010 subject classifications: Primary 62H30; secondary 68T10.
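To make the setting concrete, here is a minimal Python sketch of the abstain loss described above and of the rule that minimizes its expected value when the class probabilities are known. It assumes the misclassification cost is normalized to 1, so abstaining costs α. The names `ABSTAIN`, `abstain_loss`, `expected_loss`, and `bayes_predict` are hypothetical and chosen only for illustration; this is not the paper's surrogate-based algorithm, which has to learn from data without access to the true probabilities.

```python
from typing import Sequence

ABSTAIN = -1  # hypothetical sentinel for the abstain decision (assumption)


def abstain_loss(y_true: int, prediction: int, alpha: float) -> float:
    """Loss with misclassification cost normalized to 1 and abstain cost alpha."""
    if prediction == ABSTAIN:
        return alpha
    return 0.0 if prediction == y_true else 1.0


def expected_loss(probs: Sequence[float], prediction: int, alpha: float) -> float:
    """Expected abstain loss of a decision under known class probabilities."""
    if prediction == ABSTAIN:
        return alpha
    # Predicting class k is wrong with probability 1 - probs[k].
    return 1.0 - probs[prediction]


def bayes_predict(probs: Sequence[float], alpha: float) -> int:
    """Pick the most likely class if its probability is at least 1 - alpha,
    otherwise abstain; this minimizes expected_loss over all decisions."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return best if probs[best] >= 1.0 - alpha else ABSTAIN


if __name__ == "__main__":
    alpha = 0.5
    confident = [0.6, 0.3, 0.1]     # one class holds more than half the mass
    uncertain = [0.4, 0.35, 0.25]   # no class reaches 1 - alpha
    print(bayes_predict(confident, alpha))
    print(bayes_predict(uncertain, alpha) == ABSTAIN)
```

With α = 1/2 the rule abstains unless a single class carries at least half of the probability mass, which is why the argmax predictor alone is not suitable for this problem.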
