Evaluating Protein-protein Interaction Predictors with a Novel 3-Dimensional Metric

For biologists to adopt predicted interactions directly, machine learning predictions must have high precision, regardless of recall. Traditional metrics such as accuracy, the ROC curve, and the precision-recall curve do not evaluate or numerically represent this property well. In this work, we start from the correspondence between the sensitivity axis of the ROC curve and the recall axis of the precision-recall curve, and propose an evaluation metric that focuses on how readily a model's predictions can be adopted by biologists. The metric evaluates the ability of a machine learning algorithm to predict only new interactions while eliminating the influence of the test dataset. In experiments that evaluate different classifiers on the same dataset and the same predictor on different datasets, our new metric fulfills the evaluation task of interest, whereas two widely recognized metrics, the ROC curve and the precision-recall curve, each fail for different reasons.
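The correspondence mentioned above, and the dataset dependence that motivates the work, can be illustrated with a minimal sketch. It does not implement the proposed metric, which the abstract does not specify. The scores, labels, threshold, and helper names (`confusion_counts`, `rates`) are hypothetical and chosen only for illustration. The sketch shows that ROC sensitivity and precision-recall recall are the same quantity, and that precision changes with the class ratio of the test set while sensitivity and false positive rate do not.

```python
def confusion_counts(scores, labels, threshold):
    """Count TP, FP, FN, TN when predicting positive for score >= threshold."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            tp += 1
        elif predicted_positive and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rates(tp, fp, fn, tn):
    """Return (recall, false_positive_rate, precision).

    Recall is TP / (TP + FN): the y-axis of the ROC curve (sensitivity)
    and the x-axis of the precision-recall curve are this same value.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    return recall, fpr, precision


# Hypothetical classifier scores: 3 interacting pairs, 4 non-interacting pairs.
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2, 0.1]
threshold = 0.5

# Balanced-ish test set.
scores_a = pos_scores + neg_scores
labels_a = [True] * len(pos_scores) + [False] * len(neg_scores)

# Same classifier, same score distributions, but 10x more negatives.
scores_b = pos_scores + neg_scores * 10
labels_b = [True] * len(pos_scores) + [False] * (len(neg_scores) * 10)

recall_a, fpr_a, precision_a = rates(*confusion_counts(scores_a, labels_a, threshold))
recall_b, fpr_b, precision_b = rates(*confusion_counts(scores_b, labels_b, threshold))

print(f"set A: recall={recall_a:.3f} fpr={fpr_a:.3f} precision={precision_a:.3f}")
print(f"set B: recall={recall_b:.3f} fpr={fpr_b:.3f} precision={precision_b:.3f}")
```

In this toy example the ROC operating point (recall and false positive rate) is identical on both test sets, while precision drops when negatives are more abundant. This is one way to see why neither curve alone answers the question of whether predictions are precise enough to adopt independently of the test set composition.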
