Correlation of free-response and receiver-operating-characteristic area-under-the-curve estimates: results from independently conducted FROC/ROC studies in mammography.

PURPOSE: To study, using independently conducted free-response receiver operating characteristic (FROC) and receiver operating characteristic (ROC) experiments, fixed-reader associations between three estimators: the area under the alternative FROC (AFROC) curve computed from FROC data, the area under the ROC curve computed from FROC highest-rating data, and the area under the ROC curve computed from confidence-of-disease ratings.

METHODS: Two hundred mammograms, 100 of which were abnormal, were processed by two image-processing algorithms and interpreted by four radiologists under the FROC paradigm. Inferred-ROC data were derived from the FROC data using the highest-rating assumption. Eighteen months later, the same radiologists interpreted the images under the conventional ROC paradigm, yielding conventional-ROC data (in contrast to inferred-ROC data). FROC and ROC (inferred and conventional) data were analyzed using the nonparametric area under the curve (AUC) of the AFROC and ROC curves, respectively. Pearson correlation was used to quantify the degree of association between the modality-specific AUC indices, and standard errors were computed using the bootstrap-after-bootstrap method. The magnitude of the correlations was assessed by comparison with computed Obuchowski-Rockette fixed-reader correlations.

RESULTS: Average Pearson correlations (with 95% confidence intervals in square brackets) were: Corr(FROC, inferred ROC) = 0.76 [0.64, 0.84] > Corr(inferred ROC, conventional ROC) = 0.40 [0.18, 0.58] > Corr(FROC, conventional ROC) = 0.32 [0.16, 0.46].

CONCLUSIONS: Correlation between the FROC and inferred-ROC AUC estimates was high. Correlation between the inferred-ROC and conventional-ROC AUC estimates was similar to the correlation between two modalities for a single reader using one estimation method, suggesting that the highest-rating assumption might be questionable.
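The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions: an unmarked case receives a floor rating of 0, the nonparametric AUC is taken to be the Wilcoxon (trapezoidal) statistic, and the correlation between two AUC estimates is approximated by resampling cases with an ordinary bootstrap. All function names are hypothetical. The AFROC figure of merit and the bootstrap-after-bootstrap standard errors are not implemented here.

```python
import numpy as np

def highest_rating(marks, unmarked=0.0):
    """Inferred-ROC rating of one case: its highest mark rating.

    `unmarked` is an assumed floor rating for cases with no marks.
    """
    return max(marks) if marks else unmarked

def wilcoxon_auc(normal, abnormal):
    """Nonparametric AUC: P(abnormal > normal) + 0.5 * P(tie)."""
    x = np.asarray(normal, dtype=float)[:, None]
    y = np.asarray(abnormal, dtype=float)[None, :]
    return float(np.mean((y > x) + 0.5 * (y == x)))

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(a, b)[0, 1])

def bootstrap_auc_correlation(normal_a, abnormal_a, normal_b, abnormal_b,
                              n_boot=500, seed=0):
    """Correlate two AUC estimates over bootstrap resamples of the cases.

    Ratings `a` and `b` must refer to the same cases in the same order,
    so that one set of resampled indices is applied to both.
    """
    rng = np.random.default_rng(seed)
    normal_a, abnormal_a = np.asarray(normal_a), np.asarray(abnormal_a)
    normal_b, abnormal_b = np.asarray(normal_b), np.asarray(abnormal_b)
    auc_a, auc_b = [], []
    for _ in range(n_boot):
        i = rng.integers(0, len(normal_a), len(normal_a))      # normal cases
        j = rng.integers(0, len(abnormal_a), len(abnormal_a))  # abnormal cases
        auc_a.append(wilcoxon_auc(normal_a[i], abnormal_a[j]))
        auc_b.append(wilcoxon_auc(normal_b[i], abnormal_b[j]))
    return pearson(auc_a, auc_b)

# Toy FROC marks per case (made-up ratings, for illustration only).
froc_normal = [[], [2], [1, 3], [], [2]]
froc_abnormal = [[4], [3, 5], [], [4, 2], [5]]
inferred_normal = [highest_rating(m) for m in froc_normal]
inferred_abnormal = [highest_rating(m) for m in froc_abnormal]
inferred_auc = wilcoxon_auc(inferred_normal, inferred_abnormal)
```

Applying the same resampled case indices to both rating sets is what makes the two AUC estimates comparable within each bootstrap replicate; resampling them independently would destroy the case-level dependence that the correlation is meant to capture.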
