Bootstrapped MRMC confidence intervals

The multiple-reader, multiple-case (MRMC) paradigm of Swets and Pickett (1982) for ROC analysis was expressed as a components-of-variance model by Dorfman, Berbaum, and Metz (1992) and validated with respect to Type I error rates by Roe and Metz (1997). Our group proposed analyzing the MRMC components-of-variance model with bootstrap experiments instead of jackknife pseudovalues (Beiden, Wagner, and Campbell, 2000). These approaches have been challenged by some contemporary authors (e.g., Zhou, Obuchowski, and McClish, 2002). The purpose of the present paper is to compare the models formally and to carry out validation tests of their performance. We investigate different approaches to statistical inference, including several types of nonparametric bootstrap confidence intervals, and report on validation and simulation experiments on Type I error rates.
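As a rough illustration of what a nonparametric bootstrap confidence interval can look like in an MRMC setting, the sketch below computes a percentile interval for the difference in reader-averaged empirical AUC between two modalities. It is not the procedure evaluated in this paper: the function names, the fully crossed data layout (`scores[r][c]` is reader `r`'s rating of case `c`), the joint resampling of readers and of diseased and nondiseased cases, and the choice of the percentile interval are all assumptions made for the example.

```python
import random

def auc(neg, pos):
    """Empirical AUC (Mann-Whitney): P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(scores, readers, neg_idx, pos_idx):
    """Reader-averaged AUC over the given (possibly resampled) readers and cases."""
    total = 0.0
    for r in readers:
        total += auc([scores[r][c] for c in neg_idx],
                     [scores[r][c] for c in pos_idx])
    return total / len(readers)

def bootstrap_ci(scores_a, scores_b, truth, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean AUC(a) - mean AUC(b).

    Assumption: the same readers read the same cases in both modalities,
    so one resampled set of readers and cases is applied to both.
    """
    rng = random.Random(seed)
    n_readers = len(scores_a)
    neg = [c for c, t in enumerate(truth) if t == 0]
    pos = [c for c, t in enumerate(truth) if t == 1]
    diffs = []
    for _ in range(n_boot):
        # Resample readers and cases with replacement; cases are drawn
        # separately within the nondiseased and diseased groups.
        readers = rng.choices(range(n_readers), k=n_readers)
        neg_b = rng.choices(neg, k=len(neg))
        pos_b = rng.choices(pos, k=len(pos))
        diffs.append(mean_auc(scores_a, readers, neg_b, pos_b)
                     - mean_auc(scores_b, readers, neg_b, pos_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Calling `bootstrap_ci(scores_a, scores_b, truth)` returns the lower and upper endpoints. Other interval types, such as bias-corrected or bootstrap-t intervals, would replace only the final quantile step.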

[1]  N. Obuchowski,et al.  Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: An anova approach with dependent observations , 1995 .

[2]  R. F. Wagner,et al.  Components-of-variance models for random-effects ROC analysis: the case of unequal variance structures across modalities. , 2001, Academic radiology.

[3]  C A Roe,et al.  Dorfman-Berbaum-Metz method for statistical analysis of multireader, multimodality receiver operating characteristic data: validation with computer simulation. , 1997, Academic radiology.

[4]  R F Wagner,et al.  Analysis of uncertainties in estimates of components of variance in multivariate ROC analysis. , 2001, Academic radiology.

[5]  Bradley Efron,et al.  Bootstrap Confidence Intervals , 1996 .

[6]  Anthony C. Davison,et al.  Bootstrap Methods and Their Application , 1998 .

[7]  B. Efron,et al.  Bootstrap confidence intervals , 1996 .

[8]  K. Berbaum,et al.  Receiver operating characteristic rating analysis. Generalization to the population of readers and patients with the jackknife method. , 1992, Investigative radiology.

[9]  R. F. Wagner,et al.  Multireader, multicase receiver operating characteristic analysis: an empirical comparison of five methods. , 2004, Academic radiology.

[10]  K S Berbaum,et al.  Monte Carlo validation of a multireader method for receiver operating characteristic discrete rating data: factorial experimental design. , 1998, Academic radiology.

[11]  Robert Tibshirani,et al.  An Introduction to the Bootstrap , 1994 .

[12]  N A Obuchowski,et al.  Multireader, multimodality receiver operating characteristic curve studies: hypothesis testing and sample size estimation using an analysis of variance approach with dependent observations. , 1995, Academic radiology.