The use of receiver operating characteristic curves in biomedical informatics

Receiver operating characteristic (ROC) curves are frequently used in biomedical informatics research to evaluate classification and prediction models for decision support, diagnosis, and prognosis. ROC analysis assesses how well a model separates positive from negative cases (for example, predicting the presence or absence of disease), and its results are independent of the prevalence of positive cases in the study population. It is especially useful for evaluating predictive models or other tests that produce output values over a continuous range, because it captures the trade-off between sensitivity and specificity across that range. There are many ways to conduct an ROC analysis; the best approach depends on the experiment, and an inappropriate one can easily lead to incorrect conclusions. In this article, we review the basic concepts of ROC analysis, illustrate their use with sample calculations, make recommendations drawn from the literature, and list readily available software.
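To make the sensitivity/specificity trade-off concrete, the following is a minimal sketch of an empirical (nonparametric) ROC calculation in Python. It is not taken from the article: the function names (`roc_points`, `auc_trapezoid`, `auc_rank`) and the eight scores and labels are hypothetical, chosen only for illustration. The sketch sweeps a threshold over a continuous model output, records (1 - specificity, sensitivity) at each step, and computes the area under the curve two ways: by the trapezoidal rule and as the proportion of positive/negative pairs that the model ranks correctly (ties counted as one half).

```python
from itertools import groupby


def roc_points(scores, labels):
    """Return empirical ROC points as (1 - specificity, sensitivity) pairs.

    labels are 1 for positive cases and 0 for negative cases; a case is
    called positive when its score is at or above the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Lowering the threshold admits cases from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Tied scores cross the threshold together, so they share one point.
    for _, group in groupby(ranked, key=lambda pair: pair[0]):
        for _, label in group:
            if label == 1:
                tp += 1
            else:
                fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def auc_rank(scores, labels):
    """Proportion of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs for 4 diseased (1) and 4 healthy (0) cases.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print(points)
print(auc_trapezoid(points))     # 0.8125
print(auc_rank(scores, labels))  # 0.8125
```

Both area estimates agree because the trapezoidal area under the empirical curve equals the pairwise ranking statistic. Repeating every negative case several times (that is, lowering the prevalence of positives) leaves both values unchanged, which illustrates the prevalence independence noted above.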
