Three compartment breast machine learning model for improving computer-aided detection

Our purpose was to determine whether lipid, water, and protein lesion composition (3CB), combined with computer-aided detection (CAD), had higher specificity for biopsy-confirmed malignancy than CAD alone. High- and low-kVp full-field digital 3CB mammograms were acquired on women who had suspicious mammographic lesions (BIRADS 4) and were to undergo biopsy. Radiologists delineated 673 lesions (98 invasive ductal cancers (IDC), 60 ductal carcinomas in situ (DCIS), 103 fibroadenomata (FA), and 412 benign (BN)) on the diagnostic mammograms, using the pathology report to confirm location. The diagnostic mammograms were processed by iCAD SecondLook software at its most sensitive setting to create further delineations and probabilities of malignancy. iCAD delineated a total of 375 regions that agreed with the annotations, each classified as either a mass or a calcification cluster. The 3CB algorithm produced lipid, water, and protein thickness maps for all regions of interest (ROIs) and their peripheral rings, from which 84 compositional input features were derived. A neural network (3CBNN) was trained with cross-validation on 80% of the data to predict the lesion type, with biopsy pathology serving as the gold standard outcome. The predicted IDC and DCIS probabilities were summed to obtain a probability of malignancy, which was compared with the iCAD probabilities using the area under the ROC curve (AUC). On a holdout test set comprising 20% of the data, the iCAD output alone had an AUC of 0.61, while the 3CBNN’s AUC was 0.73. We conclude that the compositional information provided by the 3CB algorithm contains important diagnostic information that can increase the specificity of CAD software.
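
The evaluation step described in the abstract, summing the predicted IDC and DCIS probabilities into a single probability of malignancy and scoring it by the area under the ROC curve, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary layout of the class probabilities, and the five toy lesions are all hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than any particular library.

```python
# Illustrative sketch only; names and toy data are assumptions, not study data.
MALIGNANT_CLASSES = ("IDC", "DCIS")  # lesion types treated as malignant


def malignancy_probability(class_probs):
    """Sum the predicted probabilities of the malignant classes for one lesion."""
    return sum(class_probs[name] for name in MALIGNANT_CLASSES)


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for malignant, 0 for non-malignant.
    Returns the fraction of (malignant, non-malignant) pairs in which the
    malignant lesion has the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one malignant and one non-malignant case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical four-class outputs paired with a hypothetical pathology label.
toy_lesions = [
    ({"IDC": 0.60, "DCIS": 0.20, "FA": 0.10, "BN": 0.10}, "IDC"),
    ({"IDC": 0.10, "DCIS": 0.50, "FA": 0.20, "BN": 0.20}, "DCIS"),
    ({"IDC": 0.20, "DCIS": 0.10, "FA": 0.50, "BN": 0.20}, "FA"),
    ({"IDC": 0.30, "DCIS": 0.40, "FA": 0.10, "BN": 0.20}, "BN"),
    ({"IDC": 0.05, "DCIS": 0.05, "FA": 0.20, "BN": 0.70}, "BN"),
]

scores = [malignancy_probability(probs) for probs, _ in toy_lesions]
labels = [1 if truth in MALIGNANT_CLASSES else 0 for _, truth in toy_lesions]
print(f"toy AUC = {roc_auc(labels, scores):.2f}")
```

The same `roc_auc` function could be applied to a second list of scores, such as CAD-reported probabilities for the same lesions, so that both AUCs are computed on an identical holdout set.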