Classification of mammographic masses using support vector machines and Bayesian networks

In this paper, we compare two state-of-the-art classification techniques for characterizing masses as either benign or malignant, using a dataset of 271 cases (131 benign and 140 malignant), containing both an MLO and a CC view. For suspect regions in a digitized mammogram, 12 out of 81 calculated image features were selected to investigate the classification accuracy of support vector machines (SVMs) and Bayesian networks (BNs). Two additional techniques for improving their performance were included in the comparison: the Manly transformation, to bring the image features closer to a normal distribution, and principal component analysis (PCA), to reduce the dimensionality of the data. The performance of the classifiers was evaluated with receiver operating characteristic (ROC) analysis. The classifiers were trained and tested using 10-fold cross-validation. The area under the ROC curve (Az) of the BN increased significantly (p=0.0002) with the Manly transformation, from Az = 0.767 to Az = 0.795. The Manly transformation did not result in a significant change for SVMs. The difference between SVMs and BNs on the transformed dataset was also not statistically significant (p=0.78). Applying PCA improved the classification accuracy of the naive Bayesian classifier, from Az = 0.767 to Az = 0.786. The difference in classification performance between BNs and SVMs after applying PCA was small and not statistically significant (p=0.11).
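As a rough illustration of two ingredients named in the abstract, the sketch below applies the Manly exponential transformation, y = (exp(λx) − 1)/λ (with y = x at λ = 0), to a single feature and computes an empirical area under the ROC curve. It is a minimal sketch under stated assumptions, not the authors' pipeline: the feature values, the candidate λ grid, and the skewness-based choice of λ are hypothetical, and the rank-based (Mann-Whitney) AUC is a stand-in for whatever ROC estimator the study used.

```python
import math

def manly_transform(values, lam):
    """Manly exponential transformation; reduces to the identity at lam == 0."""
    if lam == 0:
        return list(values)
    return [(math.exp(lam * x) - 1.0) / lam for x in values]

def skewness(values):
    """Sample skewness (third standardized moment); 0 for a symmetric sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def pick_lambda(values, candidates):
    """Hypothetical selection rule: the candidate giving the least skewed result."""
    return min(candidates, key=lambda lam: abs(skewness(manly_transform(values, lam))))

def empirical_auc(scores, labels):
    """Mann-Whitney AUC: share of (malignant, benign) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical right-skewed feature values and labels (1 = malignant, 0 = benign).
feature = [0.1, 0.2, 0.3, 0.5, 0.9, 1.6, 3.0]
labels = [0, 0, 1, 0, 1, 1, 1]
candidates = [-1.0, -0.5, -0.25, 0.0, 0.25, 0.5]  # assumed lambda grid

best_lam = pick_lambda(feature, candidates)
transformed = manly_transform(feature, best_lam)
```

Because the transformation is strictly increasing, it leaves the ranking of a single feature, and hence its single-feature empirical AUC, unchanged. Any effect on a classifier would therefore come from how the classifier models the feature distribution, which is in line with the abstract reporting a gain for the BN but no significant change for SVMs.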
