Binary Classification with Karmic, Threshold-Quasi-Concave Metrics

Performance measures more complex than accuracy are increasingly used in binary classification. These measures are typically not decomposable: the loss evaluated on a batch of samples cannot, in general, be expressed as a sum or average of losses evaluated on individual samples. Handling them therefore requires theoretical and methodological developments beyond the standard treatment of supervised learning. In this paper, we advance the understanding of binary classification under complex performance measures by identifying two key properties: a so-called Karmic property, and a more technical threshold-quasi-concavity property, which we show is milder than the structural assumptions previously imposed on performance measures. Under these properties, we show that the Bayes optimal classifier is a threshold function of the conditional probability of the positive class. We then leverage this result to derive a computationally practical plug-in classifier, based on a novel threshold estimator, and provide a novel statistical analysis of classification error with respect to complex performance measures.
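To make the plug-in idea concrete, the following is a minimal sketch, not the paper's threshold estimator. It assumes that estimates of the conditional probability of the positive class are already available (here the hypothetical array `eta_hat`), uses the F1 score as an example of a non-decomposable measure, and picks the threshold by a plain grid search over the observed scores. The function names `f1_score`, `fit_threshold`, and `plug_in_predict` are illustrative only.

```python
import numpy as np


def f1_score(y_true, y_pred):
    """F1 computed from batch-level counts; it is not an average of per-sample losses."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom > 0 else 0.0


def fit_threshold(eta_hat, y_true, metric=f1_score):
    """Grid search (an assumption of this sketch): try each distinct estimated
    probability as a threshold and keep the one with the best empirical metric."""
    candidates = np.unique(eta_hat)  # sorted distinct scores
    scores = [metric(y_true, (eta_hat >= t).astype(int)) for t in candidates]
    best = int(np.argmax(scores))  # first maximizer on ties
    return float(candidates[best]), float(scores[best])


def plug_in_predict(eta_hat, threshold):
    """Plug-in rule: predict the positive class when the estimate reaches the threshold."""
    return (eta_hat >= threshold).astype(int)


# Toy data: hypothetical probability estimates and true labels.
eta_hat = np.array([0.1, 0.4, 0.35, 0.8])
y_true = np.array([0, 0, 1, 1])

threshold, score = fit_threshold(eta_hat, y_true)
predictions = plug_in_predict(eta_hat, threshold)
```

On this toy data the selected threshold is 0.35 rather than 0.5: accepting one false positive in order to recover both positives gives a higher F1 than the accuracy-style cutoff. In practice the threshold would be fit on data held out from the probability estimator.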
