Selection of subtle cases for observer-performance studies: the importance of knowing the true diagnosis.

RATIONALE AND OBJECTIVES: To assess the usefulness of classifying the degree of difficulty in abnormality detection and to determine the effect of knowing the true diagnosis when selecting subtle images for observer-performance studies.

MATERIALS AND METHODS: A total of 529 posteroanterior chest images that had been used in a multiabnormality, multireader observer-performance study were rated by three observers as to the difficulty of determining the presence or absence of each abnormality, both when the true diagnosis was known and when it was not. Changes in image subtlety ratings were evaluated, and actual observer-performance results were compared across groups of images defined by the raters' classifications made with and without knowledge of the true diagnosis.

RESULTS: The majority of negative cases (9,168 of 12,258, 74.8%) were rated as "easy" to determine. The selection of cases for the "subtle" category changed substantially when the truth was known compared with when it was not provided. These changes led to differences in observer performance between typical and subtle cases. Combining ratings of case subtlety by agreement among multiple raters produced a well-ordered selection, with observer performance decreasing as rated subtlety increased.

CONCLUSION: Cases for observer-performance studies that stress the diagnostic system can be successfully selected in the multiple-disease setting by experienced readers and should be selected with the truth known to the raters. The degree of agreement among multiple raters can be used to refine subtlety ratings.
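The idea of refining subtlety ratings by the degree of agreement among raters can be illustrated with a minimal sketch. It is not the authors' procedure: the case identifiers, the three category labels, and the function names are hypothetical, and the sketch only assumes that each case receives one categorical rating per rater and that cases are then ordered by how many raters called them "subtle".

```python
def agreement_level(ratings, label="subtle"):
    """Count how many raters assigned `label` to a single case."""
    return sum(1 for r in ratings if r == label)


def group_by_agreement(case_ratings, label="subtle"):
    """Group case ids by the number of raters who chose `label`.

    `case_ratings` maps a case id to a list with one rating per rater.
    Returns a dict: agreement count -> list of case ids.
    """
    groups = {}
    for case_id, ratings in case_ratings.items():
        level = agreement_level(ratings, label)
        groups.setdefault(level, []).append(case_id)
    return groups


# Hypothetical ratings from three raters (illustrative data only).
example = {
    "case_a": ["easy", "easy", "easy"],
    "case_b": ["subtle", "easy", "typical"],
    "case_c": ["subtle", "subtle", "typical"],
    "case_d": ["subtle", "subtle", "subtle"],
}

groups = group_by_agreement(example)
for level in sorted(groups):
    print(level, groups[level])
```

Under these assumptions, cases with a higher agreement count would form the stricter "subtle" subsets, and observer performance could then be compared across the resulting groups.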
