Multireader receiver operating characteristic studies: a comparison of study designs.

RATIONALE AND OBJECTIVES Traditionally, multireader receiver operating characteristic (ROC) studies have used a "paired-case, paired-reader" design. The statistical power of such a design for inferences about the relative accuracies of the tests was assessed and compared with that of alternative designs. METHODS The noncentrality parameter of an F statistic was used to compute power as a function of the reader and patient sample sizes and of the variability and correlation between readings. RESULTS For a fixed power and Type I error rate, the traditional design reduces the number of verified cases required. A hybrid design, in which each reader interprets a different sample of patients, reduces the number of readers, the total number of readings, and the number of readings required per reader. The drawback is a substantial increase in the number of verified cases. CONCLUSION The ultimate choice of study design depends on the nature of the tests being compared, the limiting resources, a priori knowledge of the magnitude of the correlations and variability, and logistical complexity.
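
The general relationship between power and the noncentrality parameter of an F statistic, on which the Methods rely, can be sketched as follows. The symbols are notation introduced here for illustration and are not taken from the abstract: \lambda is the noncentrality parameter, \nu_1 and \nu_2 are the numerator and denominator degrees of freedom, and \alpha is the Type I error rate. How \lambda depends on the reader and patient sample sizes, the variability, and the correlations is specific to each study design and is not reproduced here.

```latex
\[
  \text{Power}
  \;=\;
  \Pr\!\left( F_{\nu_1,\nu_2}(\lambda) \;>\; F_{1-\alpha;\,\nu_1,\nu_2} \right)
\]
% F_{\nu_1,\nu_2}(\lambda): noncentral F variate with noncentrality \lambda
% F_{1-\alpha;\,\nu_1,\nu_2}: (1-\alpha) quantile of the central F distribution
```

When \lambda = 0 the expression reduces to \alpha, and power increases as \lambda grows, so comparing designs at fixed power amounts to finding the sample sizes for which each design reaches the same \lambda-driven rejection probability.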
