Computer-aided classification of breast masses: performance and interobserver variability of expert radiologists versus residents.

PURPOSE To evaluate the interobserver variability in descriptions of breast masses by dedicated breast imagers and radiology residents and to determine how differences in lesion description affect the performance of a computer-aided diagnosis (CAD) classification system.
MATERIALS AND METHODS Institutional review board approval was obtained for this HIPAA-compliant study, and the requirement to obtain informed consent was waived. Images of 50 breast lesions were individually interpreted by seven dedicated breast imagers and 10 radiology residents, yielding 850 lesion interpretations. Lesions were described with use of 11 descriptors from the Breast Imaging Reporting and Data System (BI-RADS), and interobserver variability was calculated with the Cohen κ statistic. Those 11 features, together with patient age, were selected as inputs and combined by a linear discriminant analysis (LDA) classification model trained on 1005 previously existing cases. Variability in the recommendations of the computer model for different observers was also calculated with the Cohen κ statistic.
RESULTS A significant difference between the two observer groups was observed for six lesion features, and radiology residents had greater interobserver variability than dedicated breast imagers in their selection of five of those six features. The LDA model accurately classified lesions for both sets of observers (area under the receiver operating characteristic curve = 0.94 for residents and 0.96 for dedicated breast imagers). With the computer model, sensitivity was maintained at 100% for residents and improved from 98% to 100% for dedicated breast imagers. For residents, the computer model could potentially improve the specificity from 20% to 40% (P < .01) and the κ value from 0.09 to 0.53 (P < .001). For dedicated breast imagers, the computer model could potentially improve the specificity from 34% to 43% (P = .16) and the κ value from 0.21 to 0.61 (P < .001).
CONCLUSION Among findings showing a significant difference, there was greater interobserver variability in lesion descriptions among residents; however, an LDA model using data from either dedicated breast imagers or residents yielded a consistently high performance in the differentiation of benign from malignant breast lesions, demonstrating potential for improving specificity and decreasing interobserver variability in biopsy recommendations.
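
For readers unfamiliar with the Cohen κ statistic used above, the following is a minimal sketch of how agreement between two observers can be computed. It is not the authors' code. The function name `cohen_kappa`, the two reader lists, and the "biopsy"/"follow-up" labels are hypothetical and chosen only for illustration. The abstract does not state how κ was aggregated across more than two observers, so the sketch covers the two-observer case only.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen kappa for two observers rating the same items (nominal categories)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both observers must rate the same non-empty set of items")
    n = len(ratings_a)

    # Observed agreement: fraction of items given the same label by both observers.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of each observer's marginal label frequencies,
    # summed over categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both observers used one identical category throughout; kappa is
        # undefined here, and this sketch reports it as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical recommendations from two readers for the same 10 lesions.
reader_1 = ["biopsy", "biopsy", "follow-up", "biopsy", "follow-up",
            "biopsy", "follow-up", "follow-up", "biopsy", "biopsy"]
reader_2 = ["biopsy", "follow-up", "follow-up", "biopsy", "biopsy",
            "biopsy", "follow-up", "biopsy", "biopsy", "biopsy"]

kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 2))  # 0.35
```

In this toy example the readers agree on 7 of 10 lesions (observed agreement 0.70), while their marginal frequencies alone would produce agreement of 0.54 by chance, giving κ = (0.70 − 0.54) / (1 − 0.54) ≈ 0.35. A κ of 0 means agreement no better than chance and a κ of 1 means perfect agreement, which is why the reported rise in κ with the computer model indicates more consistent biopsy recommendations.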
