Technical Note: Towards ROC Curves in Cost Space

Abstract: ROC curves and cost curves are two popular ways of visualising classifier performance, finding appropriate thresholds according to the operating condition, and deriving useful aggregated measures such as the area under the ROC curve (AUC) or the area under the optimal cost curve. In this note we present some new findings and connections between ROC space and cost space, by using the expected loss over a range of operating conditions. In particular, we show that ROC curves can be transferred to cost space by means of a very natural threshold choice method: selecting the threshold such that the proportion of positive predictions equals the operating condition (either in the form of cost proportion or skew). We call these new curves ROC cost curves, and we demonstrate that the expected loss as measured by the area under these curves is linearly related to AUC. This opens up a series of new possibilities and clarifies the notion of a cost curve and its relation to ROC analysis. In addition, we show that for a classifier that assigns scores in an evenly-spaced way, these curves are equal to the Brier curves. As a result, this establishes the first clear connection between AUC and the Brier score.

Keywords: cost curves, ROC curves, Brier curves, classifier performance measures, cost-sensitive evaluation, operating condition, Brier score, Area Under the ROC Curve (AUC).

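To make the threshold choice described above concrete, here is a minimal Python sketch (not taken from the note itself: the function names, the synthetic data, the convention of predicting positive when the score is at or above the threshold, and the factor-2 normalisation of the cost-sensitive loss are all our assumptions). For each cost proportion c it picks, via a quantile, the threshold at which the proportion of positive predictions equals c, evaluates the resulting loss, and averages over a uniform grid of operating conditions to approximate the area under the ROC cost curve; AUC is computed separately via the rank statistic.

```python
import numpy as np

def auc(scores, labels):
    """AUC via the Mann-Whitney rank statistic (ties counted as half)."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (pos.size * neg.size)

def roc_cost_curve(scores, labels, num=201):
    """Loss at each cost proportion c, choosing the threshold so that the
    proportion of positive predictions equals c (the threshold choice
    described in the abstract). We use one common normalisation of the
    cost-sensitive loss: loss(c) = 2*(c*pi1*FNR + (1-c)*pi0*FPR)."""
    cs = np.linspace(0.0, 1.0, num)
    pi1 = labels.mean()              # proportion of positives
    pi0 = 1.0 - pi1
    losses = []
    for c in cs:
        t = np.quantile(scores, 1.0 - c)   # predict positive iff score >= t,
        preds = scores >= t                # so about a fraction c is positive
        fnr = np.mean(~preds[labels == 1])
        fpr = np.mean(preds[labels == 0])
        losses.append(2.0 * (c * pi1 * fnr + (1.0 - c) * pi0 * fpr))
    return cs, np.array(losses)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=2000)
    # Scores correlated with the labels, so the classifier is informative.
    scores = labels + rng.normal(scale=1.5, size=labels.size)
    cs, losses = roc_cost_curve(scores, labels)
    expected_loss = losses.mean()    # uniform grid: mean approximates the area
    print(f"AUC = {auc(scores, labels):.3f}, expected loss = {expected_loss:.3f}")
```

Re-running this with classifiers of varying quality (for example, by changing the noise scale on the scores) and plotting expected loss against AUC is one way to observe, empirically, the linear relationship the note establishes; the exact slope and intercept depend on the class proportions and the loss normalisation chosen here.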