Analysis of double reading in an observer study

We previously showed, through theoretical analysis, that appropriately combining the diagnostic opinions of two or more equally skilled readers can yield greater diagnostic performance than single reading. This gain is available when the readers' "latent decision variables," which are accessible through ROC analysis, are combined; when the readers' binary decisions regarding clinical action (e.g., recall vs. annual screening mammogram) are combined instead, the benefit is ambiguous at best. Here we analyze data from an observer study in which ten radiologists interpreted 104 mammographic cases containing clustered microcalcifications, in a diagnostic-study setting, to decide whether to recommend biopsy. The radiologists also reported, on a quasi-continuous scale, their confidence that the calcifications indicated malignancy. A previous analysis showed that combining the radiologists' binary decisions (biopsy vs. no biopsy) would shift sensitivity and specificity generally along the radiologists' average single-reading ROC curve but would not increase diagnostic performance. Combining two radiologists' "latent decision variables" produced small increases in the ROC curves, consistent with the theoretical predictions. However, the shapes of the single-reading ROC curves were inconsistent with what is expected in the clinical diagnostic-study setting, because all benign cases in the observer study were difficult to diagnose (all cases had been clinically biopsied). The double-reading results would have been different, and gains in diagnostic performance possible, had the ROC curve shape more closely resembled that of clinical practice. There is therefore a need to estimate the ROC curve of clinical practice.
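The contrast between combining latent decision variables and combining binary decisions can be illustrated with a small simulation. The sketch below is not the study's data or the authors' analysis: it assumes an equal-variance binormal model, a hypothetical separation of 1.5 between benign and malignant cases, a hypothetical between-reader correlation of 0.3, simple averaging as the combination rule for confidence ratings, and "either reader" and "both readers" rules for the biopsy decision. All function and variable names are illustrative.

```python
import random

def simulate_pair(n_cases, mean_shift, rho, rng):
    """Latent ratings of two equally skilled readers on the same cases.

    Assumption: unit-variance Gaussian ratings with between-reader
    correlation rho, induced by a shared per-case component.
    """
    ratings = []
    for _ in range(n_cases):
        shared = rng.gauss(0.0, 1.0)  # case difficulty seen by both readers
        a = mean_shift + rho ** 0.5 * shared + (1 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
        b = mean_shift + rho ** 0.5 * shared + (1 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
        ratings.append((a, b))
    return ratings

def auc(pos, neg):
    """Nonparametric (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fraction(flags):
    """Fraction of True values in an iterable of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)

rng = random.Random(0)
malignant = simulate_pair(500, 1.5, 0.3, rng)  # hypothetical parameters
benign = simulate_pair(500, 0.0, 0.3, rng)

# Combining latent decision variables: average the two confidence ratings.
auc_single = auc([a for a, _ in malignant], [a for a, _ in benign])
auc_avg = auc([(a + b) / 2 for a, b in malignant],
              [(a + b) / 2 for a, b in benign])

# Combining binary decisions: each reader recommends biopsy above threshold T.
T = 0.75  # hypothetical decision threshold shared by both readers
sens_single = fraction(a > T for a, _ in malignant)
spec_single = fraction(a <= T for a, _ in benign)
sens_or = fraction(a > T or b > T for a, b in malignant)     # either reader
spec_or = fraction(not (a > T or b > T) for a, b in benign)
sens_and = fraction(a > T and b > T for a, b in malignant)   # both readers
spec_and = fraction(not (a > T and b > T) for a, b in benign)

print("AUC single vs. averaged:", round(auc_single, 3), round(auc_avg, 3))
print("single (sens, spec):", sens_single, spec_single)
print("OR rule (sens, spec):", sens_or, spec_or)
print("AND rule (sens, spec):", sens_and, spec_and)
```

Under these assumptions, averaging the ratings raises the whole ROC curve, whereas the "either reader" and "both readers" rules trade sensitivity against specificity in opposite directions. Whether that trade amounts to a real gain depends on where the new operating point falls relative to the single-reading ROC curve, which the simulation does not settle for any clinical population.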
