Observer studies involving detection and localization: modeling, analysis, and validation.

Although the receiver operating characteristic (ROC) paradigm is the accepted method for evaluating diagnostic imaging systems, it has a serious shortcoming: it is restricted to one observer report per image. By contrast, the free-response ROC (FROC) paradigm and its associated analysis method allow the observer to report multiple abnormalities within each imaging study, and use the location of reported abnormalities to improve the measurement. Because the ROC method cannot accommodate multiple responses or use location information, its statistical power suffers. The FROC paradigm and analysis have not enjoyed widespread acceptance because of concern about whether responses made to the same diagnostic study can be treated as independent. We propose a new jackknife FROC analysis method (JAFROC) that does not make the independence assumption. The new method combines elements of the FROC and Dorfman-Berbaum-Metz (DBM) methods. To compare JAFROC with an earlier free-response analysis method (specifically the alternative free-response, or AFROC, method) and with the DBM method, which uses conventional ROC scoring, we developed a model for generating simulated FROC data. The simulation model is based on an eye-movement model of how experts evaluate images. It allowed us to examine the null hypothesis (NH) behavior and statistical power of the different methods. We found that AFROC analysis did not pass the NH test, being unduly conservative. Both the JAFROC method and the DBM method passed the NH test, but JAFROC had more statistical power than the DBM method. The results of this comparison suggest that future studies of diagnostic performance may enjoy improved statistical power or reduced sample size requirements through the use of the JAFROC method.
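The jackknife step that JAFROC shares with the DBM method can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: the figure of merit is taken to be a Wilcoxon-style comparison of lesion ratings against the highest false-positive rating on each normal case, unmarked lesions and unmarked normal cases are assigned a rating of negative infinity, and the function names `wilcoxon_fom` and `jackknife_pseudovalues` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def wilcoxon_fom(fp_max, lesion):
    """Assumed figure of merit: fraction of (normal case, lesion) pairs in which
    the lesion rating exceeds the case's highest false-positive rating.
    Ties count one half."""
    fp = np.asarray(fp_max, dtype=float)[:, None]   # column: one row per normal case
    les = np.asarray(lesion, dtype=float)[None, :]  # row: one column per lesion
    return float(np.mean((les > fp) + 0.5 * (les == fp)))


def jackknife_pseudovalues(normal_fp_max, abnormal_lesions):
    """Leave-one-case-out pseudovalues for the assumed figure of merit.

    normal_fp_max    : highest false-positive rating on each normal case
    abnormal_lesions : list of per-case lists of lesion ratings
    Returns (theta_all, pseudovalues), normal cases first, then abnormal cases.
    """
    normals = list(normal_fp_max)
    abnormals = [list(case) for case in abnormal_lesions]
    n_cases = len(normals) + len(abnormals)

    def fom(norm, abn):
        # Pool the lesion ratings of all remaining abnormal cases.
        lesions = [rating for case in abn for rating in case]
        return wilcoxon_fom(norm, lesions)

    theta_all = fom(normals, abnormals)
    pseudo = []
    # Delete each normal case in turn and recompute the figure of merit.
    for k in range(len(normals)):
        theta_k = fom(normals[:k] + normals[k + 1:], abnormals)
        pseudo.append(n_cases * theta_all - (n_cases - 1) * theta_k)
    # Delete each abnormal case (with all of its lesions) in turn.
    for k in range(len(abnormals)):
        theta_k = fom(normals, abnormals[:k] + abnormals[k + 1:])
        pseudo.append(n_cases * theta_all - (n_cases - 1) * theta_k)
    return theta_all, np.array(pseudo)


# Illustrative ratings only (not data from the study).
NO_MARK = -np.inf
theta, pseudovalues = jackknife_pseudovalues(
    normal_fp_max=[1, 2, NO_MARK],
    abnormal_lesions=[[3, 1], [2]],
)
```

Because each pseudovalue is obtained by deleting a whole case, all responses made to that case leave the estimate together, so the sketch never treats marks on the same image as independent observations. In a DBM-style analysis, pseudovalues of this kind, computed per reader and modality, would then be passed to an analysis of variance.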
