Comparison of diagnostic performances, case-based repeatability, and operating sensitivity and specificity in classification of breast lesions using DCE-MRI

Understanding how repeatable each classifier's output is, in the context of overall classification performance and operating points, can contribute to improved design of computer-aided diagnosis (CADx). Breast lesions (243 benign, 853 malignant; 1,096 total) were segmented using a fuzzy c-means method from dynamic contrast-enhanced magnetic resonance images acquired between 2005 and 2015. Thirty-eight radiomic features were extracted. Overall classification performance, case-based classification repeatability, and attainment of ‘preferred’ target and ‘optimal’ sensitivity and specificity were investigated for three classifiers (linear discriminant analysis, support vector machine, and random forest) using a 1000-iteration 0.632 bootstrap. The area under the receiver operating characteristic curve (AUC) for the task of classifying lesions as malignant or benign was determined using the 0.632+ bootstrap correction. AUC was compared between classifiers; a difference was considered statistically significant when the 98.33% confidence interval (CI) of the difference in AUC (corrected for multiple comparisons) excluded zero. Classifier repeatability was assessed through the width of the 95% CI of classifier output for each case, examined across the range of classifier output. Classifier output thresholds were determined from the training folds for target sensitivity (95%), for target specificity (95%), and for a selected ‘optimal’ operating point determined by minimizing $(1 - \mathrm{sensitivity})^2 + (1 - \mathrm{specificity})^2$, and were then applied to the test folds. No difference in AUC was observed among the three classifiers. Classifier output, however, was more repeatable when the random forest classifier was used, as indicated by a lower overall 95% CI width of classifier output. Moreover, only limited differences between classifiers were observed in the thresholds needed to attain the target and ‘optimal’ sensitivities and specificities, and in the sensitivities and specificities actually attained. CADx design may benefit from these considerations when selecting a classifier.
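The threshold selection described above can be illustrated with a minimal sketch. It assumes that a case is called malignant when its classifier output is greater than or equal to the threshold, and that candidate thresholds are the observed training outputs; the function names (`sens_spec`, `optimal_threshold`, `threshold_for_sensitivity`) and the toy data are hypothetical and are not taken from the study.

```python
import numpy as np

def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called malignant (label 1)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    predicted_malignant = scores >= threshold
    malignant = labels == 1
    benign = labels == 0
    sensitivity = np.sum(predicted_malignant & malignant) / np.sum(malignant)
    specificity = np.sum(~predicted_malignant & benign) / np.sum(benign)
    return float(sensitivity), float(specificity)

def optimal_threshold(scores, labels):
    """Threshold minimizing (1 - sensitivity)^2 + (1 - specificity)^2.

    Assumption: candidate thresholds are the unique observed scores;
    ties are broken by taking the lowest such threshold.
    """
    candidates = np.unique(scores)
    costs = []
    for t in candidates:
        se, sp = sens_spec(scores, labels, t)
        costs.append((1.0 - se) ** 2 + (1.0 - sp) ** 2)
    return float(candidates[int(np.argmin(costs))])

def threshold_for_sensitivity(scores, labels, target=0.95):
    """Largest observed-score threshold whose sensitivity is at least `target`."""
    candidates = np.unique(scores)
    # The lowest candidate always yields sensitivity 1.0, so this list is never empty.
    ok = [t for t in candidates if sens_spec(scores, labels, t)[0] >= target]
    return float(max(ok))

# Toy training and test outputs (hypothetical values, label 1 = malignant).
train_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
train_labels = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
test_scores = [0.15, 0.45, 0.55, 0.65, 0.85]
test_labels = [0, 1, 0, 1, 1]

# Thresholds are chosen on the training data only, then applied unchanged to the test data.
t_opt = optimal_threshold(train_scores, train_labels)
t_sens = threshold_for_sensitivity(train_scores, train_labels, target=0.95)
test_at_opt = sens_spec(test_scores, test_labels, t_opt)
test_at_sens = sens_spec(test_scores, test_labels, t_sens)
```

Because the thresholds are fixed on the training data, the sensitivity and specificity attained on the test data can differ from the targets, which is the gap the study examines. As a side note, the 98.33% CI level is consistent with 1 - 0.05/3, i.e. a Bonferroni-style adjustment for three pairwise comparisons, although the abstract does not name the correction used.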
