Simultaneous feature selection and classification based on genetic algorithms: an application to colonic polyp detection

Selecting a set of relevant features is a crucial step in building robust classifiers. Searching all possible subsets of features is computationally impractical when the number of features is large. Generally, a classifier is used to evaluate the separability of a given feature subset. The performance of such a classifier depends on predefined parameters, but the best choice of these parameters depends on the feature subset, and vice versa. Optimising the classifier parameters separately for each candidate subset would greatly increase the computational cost of feature selection. This paper tackles the problem by using genetic algorithms (GAs) to combine the two processes. The proposed approach chooses the most relevant features from a feature set whilst simultaneously optimising the parameters of the classifier. Its performance was tested on a colonic polyp database from a cohort study using a weighted support vector machine (SVM) classifier. Because the approach is general, other classifiers such as artificial neural networks (ANNs) and decision trees could be used in place of the SVM. It could also be applied to other classification problems, such as other computer-aided detection/diagnosis applications.
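The idea of searching features and classifier parameters together can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a binary chromosome whose first bits form a feature mask and whose last two bits select a classifier parameter, it substitutes a small k-nearest-neighbour classifier (parameter k) for the weighted SVM so that the example needs only the standard library, and it uses synthetic data. All names (`K_CHOICES`, `make_data`, `decode`, `fitness`, `run_ga`) and all GA settings are hypothetical.

```python
import random

K_CHOICES = (1, 3, 5, 7)  # hypothetical parameter grid, encoded in 2 bits


def make_data(rng, n=40, n_informative=2, n_noise=6):
    """Synthetic two-class data; only the first n_informative features carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y


def decode(chrom, n_features):
    """Split a chromosome into (feature mask, k)."""
    mask = chrom[:n_features]
    k = K_CHOICES[2 * chrom[n_features] + chrom[n_features + 1]]
    return mask, k


def fitness(chrom, X, y):
    """Leave-one-out accuracy of k-NN restricted to the selected features."""
    mask, k = decode(chrom, len(X[0]))
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Squared distances from sample i to every other sample.
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = sum(label for _, label in dists[:k])
        pred = 1 if 2 * votes > k else 0  # k is odd, so no ties
        if pred == y[i]:
            correct += 1
    return correct / len(X)


def run_ga(X, y, rng, pop_size=16, generations=10, p_mut=0.05):
    """Evolve feature mask and k together; returns (chromosome, mask, k, accuracy)."""
    n_features = len(X[0])
    n_genes = n_features + 2
    cache = {}

    def score(chrom):
        key = tuple(chrom)
        if key not in cache:
            cache[key] = fitness(chrom, X, y)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        next_pop = ranked[:2]  # elitism: keep the two best unchanged
        while len(next_pop) < pop_size:
            # Binary tournament on rank: the lower index is the fitter parent.
            p1 = ranked[min(rng.randrange(pop_size), rng.randrange(pop_size))]
            p2 = ranked[min(rng.randrange(pop_size), rng.randrange(pop_size))]
            cut = rng.randint(1, n_genes - 1)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=score)
    mask, k = decode(best, n_features)
    return best, mask, k, score(best)


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(rng)
    best, mask, k, acc = run_ga(X, y, rng)
    print("mask:", mask, "k:", k, "leave-one-out accuracy:", acc)
```

Because the parameter bits sit in the same chromosome as the feature mask, each fitness evaluation scores one subset with one parameter setting, so no separate parameter search is run per subset. Replacing `fitness` with a cross-validated score from another classifier would leave the rest of the sketch unchanged.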
