A Framework for Random-Effects ROC Analysis: Biases with the Bootstrap and Other Variance Estimators

In this article, we analyze the three-way bootstrap estimate of the variance of the reader-averaged nonparametric area under the receiver operating characteristic (ROC) curve. The setting for this work is medical imaging, and the experimental design involves sampling from three distributions: a set of normal cases and a set of diseased cases (patients), and a set of readers (doctors). The experiment we consider is fully crossed, in that each reader reads each case. A reading generates a score that indicates the reader's level of suspicion that the patient is diseased. The distribution of scores for the normal patients is compared with the distribution of scores for the diseased patients via an ROC curve, and the area under the ROC curve (AUC) summarizes the reader's ability to separate the normal patients from the diseased ones. We find that the bootstrap estimate of the variance of the reader-averaged AUC is biased, and we represent this bias in terms of moments of success outcomes. This representation helps unify and improve several current methods for multi-reader multi-case (MRMC) ROC analysis.
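The quantities named in the abstract can be illustrated with a minimal Python sketch. It assumes the scores of a fully crossed study are stored as two NumPy arrays, one of shape (readers, normal cases) and one of shape (readers, diseased cases), and that the nonparametric AUC is the Mann-Whitney average of success outcomes with ties counted as one half. The function names `reader_averaged_auc` and `three_way_bootstrap_variance` are hypothetical, and the Monte Carlo resampling loop is only one plausible reading of a three-way bootstrap, not the article's own implementation; it applies no correction for the bias the article reports.

```python
import numpy as np


def reader_averaged_auc(normal, diseased):
    """Reader-averaged nonparametric (Mann-Whitney) AUC.

    normal:   array of shape (R, N0), scores of R readers on N0 normal cases
    diseased: array of shape (R, N1), scores of the same R readers on N1 diseased cases
    """
    # diff[r, i, j] = diseased score j minus normal score i, for reader r
    diff = diseased[:, None, :] - normal[:, :, None]
    # Success outcome: 1 if the diseased case scores higher, 0.5 on a tie, else 0
    success = (diff > 0) + 0.5 * (diff == 0)
    # Every reader sees the same number of pairs, so the overall mean
    # equals the average of the per-reader AUCs
    return success.mean()


def three_way_bootstrap_variance(normal, diseased, n_boot=2000, seed=0):
    """Monte Carlo three-way bootstrap variance of the reader-averaged AUC.

    Readers, normal cases, and diseased cases are each resampled
    with replacement, independently of one another (assumed scheme).
    """
    rng = np.random.default_rng(seed)
    n_readers, n_normal = normal.shape
    n_diseased = diseased.shape[1]
    aucs = np.empty(n_boot)
    for b in range(n_boot):
        r = rng.integers(0, n_readers, size=n_readers)    # resampled readers
        i = rng.integers(0, n_normal, size=n_normal)      # resampled normal cases
        j = rng.integers(0, n_diseased, size=n_diseased)  # resampled diseased cases
        # The same resampled readers index both score arrays (fully crossed design)
        aucs[b] = reader_averaged_auc(normal[np.ix_(r, i)], diseased[np.ix_(r, j)])
    return aucs.var(ddof=1)
```

Calling `three_way_bootstrap_variance(normal, diseased)` on study data returns the sample variance of the bootstrap replicates of the reader-averaged AUC. This is the kind of estimate the abstract reports to be biased.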
