Information theory optimization based feature selection in breast mammography lesion classification

Quantitative imaging features of intensity, texture, and shape were extracted from breast lesions and the surrounding tissue in 287 mammograms (150 malignant, 137 benign). A feature set reduction method based on k-medoids clustering and k-fold cross-validation was devised to remove highly correlated features. A novel information-theoretic feature selection method was then introduced, which builds the feature set for classification by identifying a group of class-informative features with low set co-information. An artificial neural network with 10 hidden-layer nodes and the tanh activation function was built from the selected feature set. The resulting computer-aided diagnosis tool achieved a training accuracy of 96.2%, sensitivity of 97.6%, specificity of 95.2%, and area under the curve of 0.971, along with 97.1% sensitivity and 94.9% specificity on a blinded validation set.
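The abstract does not give the selection algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes features have already been discretized into bins, uses McGill's three-way co-information as the redundancy term, and applies a simple greedy rule (relevance to the class minus mean co-information with the features already chosen). The function names, the scoring rule, and the toy data are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * log2(c / n) for c in Counter(joint).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def co_information(x, y, z):
    """Three-way co-information I(X;Y;Z) = I(X;Y) - I(X;Y|Z),
    written as an inclusion-exclusion sum of joint entropies."""
    return (entropy(x) + entropy(y) + entropy(z)
            - entropy(x, y) - entropy(x, z) - entropy(y, z)
            + entropy(x, y, z))


def select_features(features, labels, k):
    """Greedy selection (an assumed scoring rule, not the paper's):
    pick the feature that is most informative about the class after
    subtracting its mean co-information with already selected features."""
    selected = []
    remaining = dict(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(remaining[name], labels)
            if not selected:
                return relevance
            redundancy = sum(
                co_information(remaining[name], features[s], labels)
                for s in selected
            ) / len(selected)
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Hypothetical toy data: binary class labels and three binned features.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "f1": [0, 0, 0, 1, 1, 1, 1, 1],       # informative
    "f1_copy": [0, 0, 0, 1, 1, 1, 1, 1],  # duplicate of f1 (fully redundant)
    "f2": [0, 0, 1, 0, 0, 1, 1, 1],       # weaker, partly complementary
}
chosen = select_features(toy_features, toy_labels, k=2)
```

In this toy setting the duplicate feature adds no class information once `f1` is chosen, so its score collapses toward zero while the complementary feature keeps a positive score. Real mammography features are continuous, so a binning step (or a continuous entropy estimator) would be needed before applying anything like this.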
