From Cost-Sensitive Classification to Tight F-measure Bounds

The F-measure is a classification performance measure, especially suited to imbalanced datasets, that provides a compromise between the precision and the recall of a classifier. As this measure is non-convex and non-linear, it is often optimized indirectly through cost-sensitive learning (which assigns different costs to false positives and false negatives). In this article, we derive theoretical guarantees that give tight bounds on the best F-measure that can be obtained from cost-sensitive learning. We also give an original geometric interpretation of these bounds, which serves as the inspiration for CONE, a new algorithm for optimizing the F-measure. On 10 datasets exhibiting varied class imbalance, we show that our bounds are much tighter than those of previous work, and that CONE learns models with F-measures that are either superior to those of existing methods or comparable but obtained in fewer iterations.
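For reference, a standard formulation of the trade-off described above (using the usual conventions rather than this paper's specific notation): the F1-measure is the harmonic mean of precision $P$ and recall $R$, while cost-sensitive learning minimizes a weighted combination of the two error types, with a weight $a$ that is not given by the F-measure itself and must be chosen or searched for:
\[
F_1 \;=\; \frac{2\,P\,R}{P + R}
\;=\; \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}},
\qquad
\text{cost-sensitive risk: } a\,\mathrm{FP} + (1 - a)\,\mathrm{FN},\quad a \in (0, 1).
\]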