Receiver operating characteristic analysis: a primer.

This issue of Academic Radiology is the second of two issues honoring the memory of Dr. Charles E. Metz, who pioneered the application of receiver operating characteristic (ROC) analysis for evaluating the diagnostic performance of radiology exams (1). Along with 10 articles that appeared in the December 2012 issue of Academic Radiology (2–11), the 15 articles featured in this issue represent some of the latest research in ROC analysis. Together, the 25 articles are a fitting tribute to the groundwork laid by Dr. Metz. If you examine the references cited in these articles, you will discover that a number of Dr. Metz’s seminal articles in ROC analysis have appeared in the pages of Academic Radiology.

ROC analysis can be complex, and it is filled with methodological nuances. To aid the reader, I have been asked by the Editor to place each of the articles in a context that is accessible to the practicing academic radiologist, as I did for the first Metz memorial issue (12).

The classic ROC curve represents an intrinsic property of a diagnostic test and plots the continuous tradeoff between the sensitivity and specificity of the test. In practice, a radiologist chooses, consciously or subconsciously, one sensitivity/specificity point on the ROC curve at which to operate. But which operating point is optimum? The ROC curve by itself cannot answer this question. Defining the optimum operating point requires additional information, mainly the prevalence of the disease being diagnosed and the relative clinical value of the possible test outcomes (true positive, false positive, true negative, and false negative). This additional information also defines the overall clinical value of the diagnostic test when the optimum operating point on the ROC curve is used. Clinically speaking, the ROC curve by itself says little about the value of the diagnostic test.
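To make the dependence on prevalence and outcome values concrete, the following is a minimal Python sketch, not a method taken from this editorial. It assumes that each of the four outcomes can be assigned a single numeric utility and that the ROC curve is available as a short list of (false-positive fraction, true-positive fraction) points. The ROC points, the utilities, the prevalences, and the function names are all hypothetical values chosen only for illustration.

```python
def expected_utility(fpr, tpr, prevalence, u_tp, u_fn, u_fp, u_tn):
    """Average utility per patient at one ROC operating point.

    fpr = 1 - specificity, tpr = sensitivity. The four u_* arguments are
    the (assumed) clinical values of the four possible test outcomes.
    """
    diseased = prevalence * (tpr * u_tp + (1.0 - tpr) * u_fn)
    healthy = (1.0 - prevalence) * (fpr * u_fp + (1.0 - fpr) * u_tn)
    return diseased + healthy


def best_operating_point(roc_points, prevalence, u_tp, u_fn, u_fp, u_tn):
    """Return the (fpr, tpr) pair with the highest expected utility."""
    return max(
        roc_points,
        key=lambda pt: expected_utility(
            pt[0], pt[1], prevalence, u_tp, u_fn, u_fp, u_tn
        ),
    )


# Hypothetical ROC curve, listed as (fpr, tpr) pairs.
roc_points = [
    (0.00, 0.00),
    (0.05, 0.50),
    (0.10, 0.70),
    (0.20, 0.85),
    (0.40, 0.95),
    (1.00, 1.00),
]

# Hypothetical utilities on an arbitrary 0-to-1 scale: a missed disease
# (false negative) is the worst outcome, a false positive costs less.
utilities = dict(u_tp=1.0, u_fn=0.0, u_fp=0.8, u_tn=1.0)

low_prevalence_point = best_operating_point(roc_points, prevalence=0.05, **utilities)
high_prevalence_point = best_operating_point(roc_points, prevalence=0.30, **utilities)

print("prevalence 0.05:", low_prevalence_point)
print("prevalence 0.30:", high_prevalence_point)
```

With these made-up numbers, the same ROC curve yields a different preferred operating point at each prevalence, and the preferred point moves toward higher sensitivity as the disease becomes more common. The curve alone does not determine either choice.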
Determining the relative clinical value of the possible diagnostic test outcomes involves utility theory (13,14) and is beyond the scope of this editorial. Such work can be
