The relationship between Precision-Recall and ROC curves

Receiver Operator Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm's performance. We show that a deep connection exists between ROC space and PR space, such that a curve dominates in ROC space if and only if it dominates in PR space. A corollary is the notion of an achievable PR curve, which has properties much like the convex hull in ROC space; we show an efficient algorithm for computing this curve. Finally, we note that the differences between the two types of curves are significant for algorithm design. For example, in PR space it is incorrect to linearly interpolate between points. Furthermore, algorithms that optimize the area under the ROC curve are not guaranteed to optimize the area under the PR curve.
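To illustrate why linear interpolation is incorrect in PR space, the following is a minimal Python sketch of one way to interpolate between two PR points using their underlying confusion-matrix counts. The abstract does not spell out the procedure, so the approach is an assumption: true positives are stepped linearly from point A to point B, and false positives grow at the local rate (FP_B - FP_A) / (TP_B - TP_A). The function name `interpolate_pr` and its parameters are hypothetical.

```python
def interpolate_pr(tp_a, fp_a, tp_b, fp_b, total_pos):
    """Sketch: interpolate (recall, precision) points between two classifiers.

    Assumptions (not stated in the abstract): counts are integers,
    tp_b > tp_a, and tp + fp > 0 at every step so precision is defined.
    """
    if tp_b <= tp_a:
        raise ValueError("point B must have more true positives than point A")

    # False positives gained per additional true positive between A and B.
    local_skew = (fp_b - fp_a) / (tp_b - tp_a)

    points = []
    for x in range(tp_b - tp_a + 1):
        tp = tp_a + x
        fp = fp_a + local_skew * x
        recall = tp / total_pos
        precision = tp / (tp + fp)
        points.append((recall, precision))
    return points


# Hypothetical example: 20 positives; A has TP=5, FP=5 and B has TP=10, FP=30.
pts = interpolate_pr(5, 5, 10, 30, total_pos=20)
for recall, precision in pts:
    print(f"recall={recall:.2f} precision={precision:.3f}")
```

In this example, a straight line between A (recall 0.25, precision 0.5) and B (recall 0.5, precision 0.25) would give precision 0.45 at recall 0.3, whereas the count-based sketch gives 0.375. The linear estimate is overly optimistic because precision does not change linearly as the false-positive count grows.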
