Optimizing F-Measures by Cost-Sensitive Classification

We present a theoretical analysis of F-measures for binary, multiclass and multilabel classification. These performance measures are non-linear, but in many scenarios they are pseudo-linear functions of the per-class false negative and false positive rates. Based on this observation, we present a general reduction of F-measure maximization to cost-sensitive classification with unknown costs. We then propose an algorithm with provable guarantees that obtains an approximately optimal classifier for the F-measure by solving a series of cost-sensitive classification problems. The strength of our analysis is that it holds for any dataset and any class of classifiers, extending existing theoretical results on F-measures, which are asymptotic in nature. We present numerical experiments to illustrate the relative importance of cost asymmetry and thresholding when learning linear classifiers on various F-measure optimization tasks.
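To make the reduction concrete, here is a minimal sketch for the binary F-beta case. It is an illustration under stated assumptions, not the paper's implementation. It assumes a grid of candidate values t in [0, 1 + beta^2], with a false negative costing 1 + beta^2 - t and a false positive costing t; this cost assignment is our reading of the reduction and should be checked against the full paper. The cost-sensitive learner is replaced by a plug-in rule applied to class-probability estimates `p_pos`, which are assumed to be given. All function names are hypothetical.

```python
import numpy as np

def f_beta(y_true, y_pred, beta=1.0):
    """F-beta computed from true positive, false negative and false positive counts."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return (1 + beta**2) * tp / denom if denom > 0 else 0.0

def cost_sensitive_predict(p_pos, cost_fn, cost_fp):
    """Plug-in cost-sensitive rule (a stand-in for a real cost-sensitive learner):
    predict positive when the expected cost of a miss is at least
    the expected cost of a false alarm."""
    return (p_pos * cost_fn >= (1 - p_pos) * cost_fp).astype(int)

def best_f_over_cost_grid(y_true, p_pos, beta=1.0, n_grid=20):
    """Solve one cost-sensitive problem per grid value t and keep the best F-beta."""
    best_score, best_t, best_pred = -1.0, None, None
    for t in np.linspace(0.0, 1 + beta**2, n_grid + 1):
        # Assumed cost assignment: false negative -> 1 + beta^2 - t, false positive -> t.
        y_pred = cost_sensitive_predict(p_pos, 1 + beta**2 - t, t)
        score = f_beta(y_true, y_pred, beta)
        if score > best_score:
            best_score, best_t, best_pred = score, t, y_pred
    return best_score, best_t, best_pred

# Toy data: labels and (assumed) probability estimates for the positive class.
y_true = np.array([1, 1, 0, 0, 0, 1])
p_pos = np.array([0.9, 0.4, 0.3, 0.1, 0.2, 0.8])
score, t, y_pred = best_f_over_cost_grid(y_true, p_pos)
print(score, t, y_pred)
```

In practice the score for each t would be measured on held-out data, and the plug-in rule would be replaced by any learner that accepts asymmetric class costs, since the analysis is stated for any class of classifiers.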
