Optimization of breast mass classification using sequential forward floating selection (SFFS) and a support vector machine (SVM) model

Purpose: Improving radiologists’ performance in classifying breast lesions as malignant or benign is important for increasing cancer detection sensitivity and reducing false-positive recalls. To this end, the development of computer-aided diagnosis schemes has attracted research interest in recent years. In this study, we investigated a new feature selection method for the task of breast mass classification.

Methods: We initially computed 181 image features based on mass shape, spiculation, contrast, presence of fat or calcifications, texture, isodensity, and other morphological features. From this large feature pool, we used a sequential forward floating selection (SFFS)-based feature selection method to select relevant features and analyzed their performance using a support vector machine (SVM) model trained for the classification task. We performed the study on a database of 600 benign and 600 malignant mass regions of interest using tenfold cross-validation. Feature selection and optimization of the SVM parameters were conducted on the training subsets only.

Results: The classification task yielded an area under the receiver operating characteristic curve (AUC) of $0.805 \pm 0.012$. The features most frequently selected by the SFFS-based algorithm across the tenfold iterations were those related to mass shape, isodensity, and presence of fat, which is consistent with the image features that radiologists frequently use for mass classification in the clinical environment. The study also indicated that accurately computing mass spiculation features from projection mammograms was difficult, and that these features did not perform well in the mass classification task because of tissue overlap within the benign mass regions.

Conclusion: This comprehensive feature analysis study provided new and valuable information for optimizing computerized mass classification schemes that may have the potential to be useful as a “second reader” in future clinical practice.
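The floating search named in the Methods can be illustrated with a short sketch. This is not the authors' implementation: the function names `sffs` and `best_subset`, the `score` callback, and the toy criterion are all hypothetical and serve only to show the control flow. In a setting like the one described, `score` would presumably be the performance of an SVM estimated from the training subset only, so that the held-out fold never influences which features are selected.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Subset = FrozenSet[int]


def sffs(n_features: int,
         score: Callable[[Subset], float],
         max_size: int) -> Dict[int, Tuple[float, Subset]]:
    """Sequential forward floating selection (illustrative sketch).

    Returns, for every subset size visited, the best (score, subset) found.
    `score` is a hypothetical criterion: higher means a better subset.
    """
    best: Dict[int, Tuple[float, Subset]] = {}
    current: Subset = frozenset()

    while len(current) < max_size:
        # Forward step: add the single feature that helps the most.
        candidates = [f for f in range(n_features) if f not in current]
        if not candidates:
            break
        added = max(candidates, key=lambda f: score(current | {f}))
        current = current | {added}
        s = score(current)
        if len(current) not in best or s > best[len(current)][0]:
            best[len(current)] = (s, current)

        # Floating step: drop the least useful feature for as long as the
        # reduced subset beats the best subset of that size seen so far.
        while len(current) > 2:
            drop = max(current, key=lambda f: score(current - {f}))
            reduced = current - {drop}
            s_reduced = score(reduced)
            if s_reduced > best[len(reduced)][0]:
                current = reduced
                best[len(reduced)] = (s_reduced, reduced)
            else:
                break

    return best


def best_subset(best: Dict[int, Tuple[float, Subset]]) -> Tuple[float, Subset]:
    """Pick the highest-scoring subset over all sizes visited."""
    return max(best.values(), key=lambda pair: pair[0])


# Toy criterion (an assumption for this demo only): features 0 and 3 are
# informative, every other feature costs a small penalty.
def toy_score(subset: Subset) -> float:
    useful = {0, 3}
    return len(subset & useful) - 0.1 * len(subset - useful)


if __name__ == "__main__":
    value, chosen = best_subset(sffs(n_features=5, score=toy_score, max_size=4))
    print(sorted(chosen), value)
```

In a tenfold cross-validation, a routine of this kind would be rerun inside each training subset, which is also what allows counting how often each feature is selected across the ten iterations.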
