Predictors of interobserver agreement in breast imaging using the Breast Imaging Reporting and Data System

The Breast Imaging Reporting and Data System (BI-RADS) was introduced in 1993 to standardize the interpretation of mammograms. Although many studies have assessed the validity of the system, fewer have examined its reliability. Our objective was to identify predictors of reliability, as measured by the kappa statistic. We identified studies conducted between 1993 and 2009 that reported kappa values for interpreting mammograms using any edition of BI-RADS. Bivariate and multivariate multilevel analyses were used to examine associations between potential predictors and kappa values. We identified ten eligible studies, which yielded 88 kappa values for the analysis. Potential predictors of kappa included whether the study included negative cases, whether single- or two-view mammograms were used, whether mammograms were digital or screen-film, whether the fourth edition of BI-RADS was used, the BI-RADS category being evaluated, whether readers were trained, whether readers' professional activities overlapped, the number of cases in the study, and the country in which the study was conducted. Our best multivariate model identified training, use of two-view mammograms, and BI-RADS category (masses, calcifications, and final assessments) as predictors of kappa. Training, use of two-view mammograms, and a focus on mass description may help increase reliability in mammogram interpretation. Calcification and final assessment descriptors are areas for potential improvement. These findings can inform policies on BI-RADS use, both before the system is introduced in new settings and when improving current implementations.
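
For readers unfamiliar with the kappa statistic, the following is a minimal sketch of the unweighted two-reader form (Cohen's kappa), which compares observed agreement with the agreement expected by chance. It is illustrative only: the function name and the example ratings are hypothetical, and the included studies may have reported weighted or multi-reader variants of kappa rather than this form.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two readers rating the same cases.

    Illustrative helper (not taken from the paper). Each argument is a
    sequence of category labels, one per case, in the same case order.
    """
    if len(reader_a) != len(reader_b) or len(reader_a) == 0:
        raise ValueError("Both readers must rate the same non-empty set of cases")

    n = len(reader_a)

    # Observed agreement: fraction of cases given the same category.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n

    # Chance agreement: for each category, the product of the two
    # readers' marginal proportions, summed over categories.
    counts_a = Counter(reader_a)
    counts_b = Counter(reader_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both readers used a single identical category; kappa is undefined.
        raise ValueError("Kappa is undefined when chance agreement equals 1")

    return (observed - expected) / (1.0 - expected)


# Hypothetical final-assessment categories assigned to four cases.
ratings_a = ["2", "2", "4", "4"]
ratings_b = ["2", "2", "4", "2"]
kappa = cohen_kappa(ratings_a, ratings_b)
```

In this made-up example the readers agree on three of four cases (observed agreement 0.75) while chance agreement is 0.5, so kappa is 0.5. A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.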