Exact Bootstrap Variances of the Area Under ROC Curve

The area under the receiver operating characteristic (ROC) curve (AUC) and related summary indices are widely used to assess the accuracy of an individual diagnostic system and to compare the performance of several systems in many areas, including studies of human perception, decision making, and the regulatory approval process for new diagnostic technologies. Many investigators have suggested using the bootstrap to estimate the variability of AUC-based indices. The corresponding bootstrap quantities are typically estimated by sampling from the bootstrap distribution. This process, frequently termed the Monte Carlo (MC) bootstrap, is often computationally burdensome and adds sampling error to the resulting estimates. In this article, we demonstrate that the exact, or ideal (sampling-error-free), bootstrap variances of the nonparametric estimator of the AUC can be computed directly, i.e., without resampling the original data, and we develop easy-to-use formulas for computing them. We derive formulas for the variance of the AUC for a single given or random reader, and for the average over several given or randomly selected readers. The derived formulas provide an algorithm for computing the ideal bootstrap variances exactly and hence improve many previously proposed bootstrap methods for analyzing AUCs by eliminating the sampling error and the sometimes burdensome computations associated with an MC approximation. In addition, the availability of closed-form solutions makes it possible to assess the properties of bootstrap variance estimators analytically. We apply the proposed method to two experimentally ascertained datasets that illustrate settings commonly encountered in diagnostic imaging. In the context of these two examples, we also demonstrate the magnitude of the effect of the sampling error of the MC estimators on the resulting inferences.
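The contrast between the MC bootstrap and the exact bootstrap variance can be illustrated for the simplest case, a single fixed reader. The sketch below is not taken from the article: it assumes that normal and abnormal cases are resampled independently with replacement, and it uses a closed form obtained by expanding the variance of the Mann-Whitney-type AUC estimator under that resampling scheme. The function names (`psi_matrix`, `exact_bootstrap_var`, `mc_bootstrap_var`) are hypothetical, and the article's own formulas, in particular those for random or multiple readers, may be written differently.

```python
import numpy as np


def psi_matrix(x, y):
    """Pairwise kernel: 1 if abnormal score y_j > normal score x_i, 0.5 if tied, else 0."""
    x = np.asarray(x, dtype=float)[:, None]  # normal cases as rows
    y = np.asarray(y, dtype=float)[None, :]  # abnormal cases as columns
    return (y > x) + 0.5 * (y == x)


def auc(x, y):
    """Nonparametric (Mann-Whitney) AUC estimate: the mean of the kernel matrix."""
    return psi_matrix(x, y).mean()


def exact_bootstrap_var(x, y):
    """Ideal bootstrap variance of the AUC, computed without resampling.

    Assumption (not from the article): normal and abnormal cases are
    resampled independently. Under that scheme the variance reduces to
    three second-moment terms of the kernel matrix.
    """
    psi = psi_matrix(x, y)
    m, n = psi.shape
    a = psi.mean()
    cell = (psi ** 2).mean() - a ** 2              # same normal, same abnormal case
    row = (psi.mean(axis=1) ** 2).mean() - a ** 2  # same normal, different abnormal
    col = (psi.mean(axis=0) ** 2).mean() - a ** 2  # different normal, same abnormal
    return (cell + (n - 1) * row + (m - 1) * col) / (m * n)


def mc_bootstrap_var(x, y, n_boot=2000, seed=0):
    """Monte Carlo approximation of the same variance (carries sampling error)."""
    rng = np.random.default_rng(seed)
    psi = psi_matrix(x, y)
    m, n = psi.shape
    stats = np.empty(n_boot)
    for b in range(n_boot):
        i = rng.integers(0, m, size=m)  # resampled normal cases
        j = rng.integers(0, n, size=n)  # resampled abnormal cases
        stats[b] = psi[np.ix_(i, j)].mean()
    return stats.var()


if __name__ == "__main__":
    # Illustrative ratings only; these are not data from the article.
    normal = [1, 2, 2, 3, 4, 1, 3]
    abnormal = [3, 4, 5, 2, 5, 4]
    print("AUC:", auc(normal, abnormal))
    print("exact bootstrap variance:", exact_bootstrap_var(normal, abnormal))
    print("MC bootstrap variance:   ", mc_bootstrap_var(normal, abnormal))
```

Rerunning `mc_bootstrap_var` with different seeds gives slightly different values, whereas `exact_bootstrap_var` is deterministic; that difference is the MC sampling error the abstract refers to.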
