A Multisite Study of a Breast Density Deep Learning Model for Full-Field Digital Mammography and Synthetic Mammography.

Purpose: To develop a Breast Imaging Reporting and Data System (BI-RADS) breast density deep learning (DL) model in a multisite setting for synthetic two-dimensional mammographic (SM) images derived from digital breast tomosynthesis examinations by using full-field digital mammographic (FFDM) images and limited SM data.

Materials and Methods: In this retrospective study, a DL model was trained to predict BI-RADS breast density by using FFDM images acquired from 2008 to 2017 (site 1: 57 492 patients, 187 627 examinations, 750 752 images). The FFDM model was evaluated by using SM datasets from two institutions (site 1: 3842 patients, 3866 examinations, 14 472 images, acquired from 2016 to 2017; site 2: 7557 patients, 16 283 examinations, 63 973 images, acquired from 2015 to 2019). Each of the three datasets was split into training, validation, and test sets. Adaptation methods were investigated to improve performance on the SM datasets, and the effect of dataset size on each adaptation method was considered. Statistical significance was assessed by using CIs, which were estimated by bootstrapping.

Results: Without adaptation, the model demonstrated substantial agreement with the original reporting radiologists for all three datasets (site 1 FFDM: linearly weighted Cohen κ [κw] = 0.75 [95% CI: 0.74, 0.76]; site 1 SM: κw = 0.71 [95% CI: 0.64, 0.78]; site 2 SM: κw = 0.72 [95% CI: 0.70, 0.75]). With adaptation by using only 500 SM images from the respective site, performance improved for site 2 but did not change significantly for site 1 (site 1: κw = 0.72 [95% CI: 0.66, 0.79], 0.71 vs 0.72, P = .80; site 2: κw = 0.79 [95% CI: 0.76, 0.81], 0.72 vs 0.79, P < .001).

Conclusion: A BI-RADS breast density DL model demonstrated strong performance on FFDM and SM images from two institutions without training on SM images, and its performance improved when it was adapted by using a small number of SM images.

Supplemental material is available for this article.

Published under a CC BY 4.0 license.
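The agreement statistic reported above, a linearly weighted Cohen κ with a bootstrapped 95% CI, can be illustrated with a minimal sketch. This is not the authors' code: the function names `linear_weighted_kappa` and `bootstrap_ci`, the integer encoding of the four BI-RADS density categories as 0 to 3, the percentile bootstrap, and resampling at the level of individual label pairs are all assumptions made for illustration. The abstract does not state the resampling unit or the type of bootstrap interval.

```python
import random


def linear_weighted_kappa(rater_a, rater_b, n_categories=4):
    """Linearly weighted Cohen kappa for ordinal labels in 0..n_categories-1.

    Assumption: BI-RADS density categories are encoded as integers 0..3.
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("inputs must be non-empty and of equal length")
    k = n_categories
    # Observed joint distribution of (rater_a, rater_b) labels.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Linear disagreement weights |i - j| / (k - 1).
    obs_dis = 0.0
    exp_dis = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            obs_dis += w * observed[i][j]
            exp_dis += w * row[i] * col[j]
    if exp_dis == 0.0:
        return float("nan")  # degenerate case: every label is identical
    return 1.0 - obs_dis / exp_dis


def bootstrap_ci(rater_a, rater_b, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the weighted kappa.

    Assumption: label pairs are resampled independently with replacement.
    """
    rng = random.Random(seed)
    n = len(rater_a)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        kappa = linear_weighted_kappa(
            [rater_a[i] for i in idx], [rater_b[i] for i in idx]
        )
        if kappa == kappa:  # skip NaN from degenerate resamples
            stats.append(kappa)
    if not stats:
        raise ValueError("all bootstrap resamples were degenerate")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


if __name__ == "__main__":
    # Toy labels invented for illustration; not data from the study.
    model = [0, 1, 2, 3, 1, 2, 2, 1, 0, 3, 2, 1]
    radiologist = [0, 1, 2, 2, 1, 2, 3, 1, 1, 3, 2, 0]
    print(linear_weighted_kappa(model, radiologist))
    print(bootstrap_ci(model, radiologist, n_resamples=500))
```

With linear weights, a prediction that is off by one density category is penalized one third as much as a prediction that is off by three, which suits an ordinal scale such as BI-RADS density. In a study with several examinations per patient, resampling by patient rather than by label pair would be a natural alternative to the simplification used here.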
