Evaluating imaging systems in the absence of truth: a comparison of ROC and mixture distribution analysis in computer-aided diagnosis in mammography

A rigorous ROC analysis requires independent proof of the true status of each case, which cannot be obtained in many practical clinical situations. An alternative is to measure diagnostic reliability as relative percent agreement (RPA). A large set of mammograms read with and without computer-aided diagnosis (CAD) was used to compare the ROC area (Az), computed using proof, with the RPA, computed using agreement. A subset of 767 cases (416 normal and 351 abnormal) read by 5 generalists and 5 mammographers was selected from the readings on 900 proved mammograms read with and without CAD. The Az was calculated using the multireader-multicase (MRMC) method. The RPA was calculated from a mixture distribution analysis (MDA) using the expectation-maximization (EM) algorithm. Individual reader values were calculated by a jackknife procedure. With and without CAD, the Az was .90 and .88 for the mammographers (p = .04) and .87 and .86 for the generalists (p = .5). The RPA was 90 and 83 for the mammographers (p = .08) and 82 and 85 for the generalists (p = .3). The correlation between Az and RPA was 0.6. The larger variance of the RPA decreases its statistical power. The MDA may be a useful method for comparing imaging modalities in clinical studies where definitive proof cannot be obtained.
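As an illustration of how an EM algorithm can fit a mixture distribution to reader agreement data without any proof of truth, the following is a minimal sketch. It is not the model used in the study: it assumes a simple two-component binomial mixture in which each case is summarized by the number of readers who called it abnormal, and it does not compute the RPA itself. The function names, the simulated rates (0.45, 0.85, 0.10), and the starting values are all hypothetical; only the case count (767) and reader count (10) are taken from the abstract.

```python
import random


def fit_two_binomial_mixture(counts, n_readers, iters=200):
    """EM fit of a two-component binomial mixture (assumed model).

    counts[i] is the number of readers (out of n_readers) who called
    case i abnormal. Returns (pi, p_abn, p_nor): the estimated fraction
    of abnormal cases and the per-reader positive-call rate in each
    latent class.
    """
    eps = 1e-9
    n_cases = len(counts)
    pi, p_abn, p_nor = 0.5, 0.7, 0.3  # asymmetric start fixes the labels
    for _ in range(iters):
        # E-step: posterior probability that each case is abnormal.
        # The binomial coefficient cancels in the ratio, so it is omitted.
        resp = []
        for k in counts:
            a = pi * p_abn ** k * (1 - p_abn) ** (n_readers - k)
            b = (1 - pi) * p_nor ** k * (1 - p_nor) ** (n_readers - k)
            resp.append(a / (a + b))
        # M-step: weighted maximum-likelihood updates.
        w_abn = min(max(sum(resp), eps), n_cases - eps)
        w_nor = n_cases - w_abn
        pi = w_abn / n_cases
        p_abn = sum(r * k for r, k in zip(resp, counts)) / (n_readers * w_abn)
        p_nor = sum((1 - r) * k for r, k in zip(resp, counts)) / (n_readers * w_nor)
        # Keep rates strictly inside (0, 1) so the likelihood stays finite.
        p_abn = min(max(p_abn, eps), 1 - eps)
        p_nor = min(max(p_nor, eps), 1 - eps)
    return pi, p_abn, p_nor


def simulate_counts(n_cases, n_readers, frac_abn, p_abn, p_nor, seed=0):
    """Generate synthetic per-case positive-call counts (not study data)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_cases):
        p = p_abn if rng.random() < frac_abn else p_nor
        counts.append(sum(rng.random() < p for _ in range(n_readers)))
    return counts


# Synthetic example: the truth labels are discarded, only counts are used.
counts = simulate_counts(767, 10, 0.45, 0.85, 0.10)
pi_hat, p_abn_hat, p_nor_hat = fit_two_binomial_mixture(counts, 10)
print(pi_hat, p_abn_hat, p_nor_hat)
```

The fit uses only the pattern of agreement among readers, which is the property that makes a mixture approach attractive when proof is unavailable. A jackknife over cases or readers, as mentioned above, could be layered on top by refitting with one unit left out at a time.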