Effects of the continuous and discrete confidence rating scales in ROC observer studies

We previously conducted an observer study evaluating radiologists' performance in characterizing mammographic masses on serial mammograms with and without CAD. The data set consisted of 253 temporal image pairs (138 malignant and 115 benign) from 96 patients, each pair containing a mass on serial mammograms. Our CAD program analyzed the interval change characteristics of the mass on each temporal pair to differentiate malignant from benign masses. The classifier achieved a test Az value of 0.87 for the data set. Eight MQSA radiologists and 2 fellows assessed the temporal masses and provided estimates of the likelihood of malignancy (LM) and a BI-RADS assessment, first without and then with CAD. The LM estimates were provided on a quasi-continuous confidence-rating scale (CRS) of 1 to 100. In the current study we investigated the effects of using discrete CRSs with fewer categories on ROC analysis. We simulated three discrete CRSs containing 5, 10, and 20 categories by binning the observers' quasi-continuous LM ratings. For the ten observers, without CAD, the average Az values in estimating the LM for the 5-, 10-, 20-, and 100-category CRSs were 0.788, 0.786, 0.785, and 0.787, respectively. With CAD, the observers' average Az improved to 0.845, 0.843, 0.844, and 0.843, respectively. The improvement was statistically significant (p < 0.011) for each CRS. The partial area index for the four CRSs without CAD was 0.198, 0.204, 0.200, and 0.206, respectively. With CAD, the partial area index also improved significantly, to 0.369, 0.365, 0.369, and 0.366, respectively (p < 0.006 for all CRSs). In this study, the choice between the continuous and the discrete confidence-rating scales had minimal effect on the analysis of observer performance.
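The binning step can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: the bins are taken to be equal-width (for example, ratings 1 to 20 map to category 1 on the 5-category scale), and the area under the curve is computed as the empirical (Mann-Whitney) AUC rather than the fitted Az reported in the study. The function names `bin_rating` and `empirical_auc` and the example ratings are hypothetical and are not study data.

```python
def bin_rating(rating, n_categories, scale_max=100):
    """Map an integer rating in 1..scale_max to a category in 1..n_categories.

    Assumption: equal-width bins; the abstract does not give the bin edges.
    """
    if not 1 <= rating <= scale_max:
        raise ValueError("rating outside the confidence-rating scale")
    # Integer ceiling of rating * n_categories / scale_max.
    return (rating * n_categories + scale_max - 1) // scale_max


def empirical_auc(malignant, benign):
    """Empirical (Mann-Whitney) AUC; tied ratings count as one half.

    This is a stand-in for the fitted Az used in the study, not the same estimator.
    """
    score = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                score += 1.0
            elif m == b:
                score += 0.5
    return score / (len(malignant) * len(benign))


# Hypothetical ratings from one observer (illustration only).
malignant_lm = [90, 75, 60, 35]
benign_lm = [10, 30, 55, 65]

auc_by_scale = {}
for n in (5, 10, 20, 100):
    binned_malignant = [bin_rating(r, n) for r in malignant_lm]
    binned_benign = [bin_rating(r, n) for r in benign_lm]
    auc_by_scale[n] = empirical_auc(binned_malignant, binned_benign)

for n, auc in auc_by_scale.items():
    print(f"{n:3d} categories: empirical AUC = {auc:.3f}")
```

With so few hypothetical cases, coarse binning creates ties that visibly shift the empirical AUC. A sketch like this therefore shows only the mechanics of the binning and says nothing about the size of the effect in the study.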
