Automatic detection of breast cancers in mammograms using structured support vector machines

Breast cancer is one of the most commonly diagnosed cancers in women. Large margin classifiers such as the support vector machine (SVM) have been reported to be effective in computer-assisted diagnosis systems for breast cancer. However, because the separating hyperplane is determined exclusively by the support vectors, the SVM is essentially a local classifier, and its performance can be improved further. In this work, we introduce a structured SVM model that classifies each mammographic region as normal or cancerous by taking the cluster structures in the training set into account. The optimization problem in this new model can be solved efficiently by formulating it as a single second-order cone programming problem. Experimental evaluation is performed on the Digital Database for Screening Mammography (DDSM) dataset. Several types of features, including curvilinear, texture, Gabor, and multi-resolution features, are extracted from the sample images. We then select the salient features using the recursive feature elimination algorithm. Measured by the area under the ROC curve, the structured SVM achieves better detection performance than a well-tested SVM classifier.
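The comparison metric named above, the area under the ROC curve, can be illustrated with a minimal sketch. It uses the standard pairwise-ranking identity: the AUC equals the fraction of (cancerous, normal) pairs in which the cancerous region receives the higher score, with ties counted as one half. The function name `roc_auc`, the 1/0 label encoding, and the score values are assumptions made for illustration; they are not taken from the paper or its experiments.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise-ranking identity.

    labels: 1 for cancerous, 0 for normal (an assumed encoding).
    scores: classifier decision values; larger means more suspicious.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample from each class")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up decision values for six regions from two hypothetical
    # classifiers; these are not results from the paper.
    y = [1, 1, 1, 0, 0, 0]
    scores_a = [0.9, 0.4, 0.6, 0.5, 0.3, 0.2]
    scores_b = [0.9, 0.7, 0.6, 0.5, 0.3, 0.2]
    print(roc_auc(y, scores_a), roc_auc(y, scores_b))
```

The double loop is quadratic in the number of regions, which is acceptable for a small illustration; a rank-based computation would be the usual choice for a dataset of DDSM scale.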
