Monte Carlo validation of the Dorfman-Berbaum-Metz method using normalized pseudovalues and less data-based model simplification.

RATIONALE AND OBJECTIVES: The Dorfman-Berbaum-Metz (DBM) method for analyzing multireader receiver operating characteristic (ROC) studies has two known problems: it tends to be conservative, and it can produce estimates of the area under the ROC curve (AUC) that fall outside the parameter space (ie, greater than one or less than zero). It has recently been shown that using normalized pseudovalues eliminates the problem of AUC (or other accuracy) estimates outside the parameter space, and it has been suggested that less data-based model simplification be used. Our purpose is to investigate empirically whether these two modifications, normalized pseudovalues and less data-based model simplification, improve performance.

MATERIALS AND METHODS: We examine the performance of the DBM procedure with the two proposed modifications for discrete and continuous ratings in a null simulation study that compares modalities with respect to the ROC area. The simulation study includes 144 different combinations of reader and case sample sizes, normal/abnormal case sample ratios, and variance components. The ROC area is estimated using both parametric and nonparametric estimation.

RESULTS: The DBM procedure with both modifications performs better than either the original DBM procedure or the DBM procedure with only one of the modifications. For parametric estimation with discrete rating data, use of both modifications resulted in the mean type I error rate (0.043) closest to the nominal 0.05 level, and in the smallest range (0.050) and standard deviation (0.0108) across the 144 type I error rates.

CONCLUSIONS: We recommend that normalized pseudovalues and less data-based model simplification be used with the DBM procedure.
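As an illustration of the pseudovalue idea, the following is a minimal Python sketch for a single reader-modality combination. It is not the authors' implementation. It assumes the nonparametric (Wilcoxon) AUC estimate, and it assumes that "normalized" means shifting the raw jackknife pseudovalues by a constant so that their mean equals the AUC estimate from the full data. The function names and the example ratings are hypothetical.

```python
import numpy as np


def wilcoxon_auc(normal, abnormal):
    """Nonparametric AUC: P(abnormal > normal) + 0.5 * P(tie)."""
    normal = np.asarray(normal, dtype=float)
    abnormal = np.asarray(abnormal, dtype=float)
    # Pairwise differences between every abnormal and every normal rating.
    diff = abnormal[:, None] - normal[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def jackknife_pseudovalues(normal, abnormal):
    """Return the full-data AUC and one raw pseudovalue per case."""
    normal = np.asarray(normal, dtype=float)
    abnormal = np.asarray(abnormal, dtype=float)
    n_cases = len(normal) + len(abnormal)
    theta = wilcoxon_auc(normal, abnormal)
    pseudo = []
    # Delete each normal case in turn, then each abnormal case.
    for i in range(len(normal)):
        theta_del = wilcoxon_auc(np.delete(normal, i), abnormal)
        pseudo.append(n_cases * theta - (n_cases - 1) * theta_del)
    for j in range(len(abnormal)):
        theta_del = wilcoxon_auc(normal, np.delete(abnormal, j))
        pseudo.append(n_cases * theta - (n_cases - 1) * theta_del)
    return theta, np.array(pseudo)


def normalize_pseudovalues(theta, pseudo):
    """Assumed normalization: shift so the pseudovalue mean equals theta."""
    return pseudo + (theta - pseudo.mean())


# Hypothetical 5-point ratings for one reader reading one modality.
normal_ratings = [1, 2, 2, 3, 1]
abnormal_ratings = [3, 4, 2, 5]

theta, raw = jackknife_pseudovalues(normal_ratings, abnormal_ratings)
normalized = normalize_pseudovalues(theta, raw)
print("AUC estimate:", theta)
print("mean raw pseudovalue:", raw.mean())
print("mean normalized pseudovalue:", normalized.mean())
```

Because the shift is a constant, the normalized pseudovalues keep the same case-to-case variability as the raw ones. Only their mean changes, and it is tied to an estimate that lies between zero and one. In a full DBM analysis, pseudovalues computed this way for every reader-modality combination would be the input to the mixed-model analysis of variance, which this sketch does not cover.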
