Learning with Rejection

We introduce a novel framework for classification with a rejection option that consists of simultaneously learning two functions: a classifier and a rejection function. We present a full theoretical analysis of this framework, including new data-dependent learning bounds in terms of the Rademacher complexities of the classifier and rejection families, as well as consistency and calibration results. These theoretical guarantees guide the design of new algorithms that can exploit different kernel-based hypothesis sets for the classifier and the rejection function. We compare our general framework with the special case of confidence-based rejection, for which we also devise alternative loss functions and algorithms. We report the results of several experiments showing that our kernel-based algorithms can yield a notable improvement over the best existing confidence-based rejection algorithm.
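
To make the two-function setup concrete, the following is a minimal sketch of how a classifier score and a separate rejection score might be combined into a single loss. The exact loss is not stated in this text, so the form used here is an assumption: a point is rejected when the rejection score is non-positive and then incurs a fixed cost `c`; otherwise it incurs the usual zero-one misclassification loss. The names `rejection_loss`, `confidence_based_rejector`, `c`, and `gamma` are hypothetical and chosen only for illustration.

```python
import numpy as np


def rejection_loss(h_scores, r_scores, y, c):
    """Per-example loss for a (classifier, rejector) pair.

    Assumed convention (not taken from the text above):
      - r(x) <= 0 means "reject", which costs c;
      - r(x) > 0 means "accept", which costs 1 if y * h(x) <= 0, else 0.
    """
    h_scores = np.asarray(h_scores, dtype=float)
    r_scores = np.asarray(r_scores, dtype=float)
    y = np.asarray(y, dtype=float)

    accepted = r_scores > 0
    misclassified = (y * h_scores) <= 0
    return np.where(accepted, misclassified.astype(float), float(c))


def confidence_based_rejector(h_scores, gamma):
    """Special case: reject whenever the classifier's margin |h(x)| is at most gamma."""
    return np.abs(np.asarray(h_scores, dtype=float)) - gamma


# Toy data: labels in {-1, +1}, arbitrary real-valued scores.
h = [2.0, -0.5, 0.3, -1.5]
y = [1, 1, -1, -1]

# General case: the rejection score comes from its own function.
r_general = [1.0, -1.0, 0.5, 2.0]
loss_general = rejection_loss(h, r_general, y, c=0.25)

# Confidence-based case: the rejection score is derived from h itself.
r_confidence = confidence_based_rejector(h, gamma=0.4)
loss_confidence = rejection_loss(h, r_confidence, y, c=0.25)
```

In this sketch the confidence-based rejector is tied to the classifier's own score, whereas the general case lets the rejection score come from an unrelated function, which is the distinction the abstract draws between the two settings.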
