Feature selection for computer-aided polyp detection using MRMR

In building robust classifiers for computer-aided detection (CAD) of lesions, the selection of relevant features is of fundamental importance. Typically, one wants to determine which features, out of a large number of potentially redundant or noisy ones, are most discriminative for classification. Searching all possible feature subsets is computationally impractical. This paper proposes a feature selection scheme that combines AdaBoost with the Minimum Redundancy Maximum Relevance (MRMR) criterion to focus on the most discriminative features. A fitness function is designed to determine the optimal number of features in a forward wrapper search. Bagging is applied to reduce the variance of the classifier and make the selection reliable. Experiments demonstrate that with just 11 percent of the total features selected, the classifier achieves better prediction on independent test data than with the 70 percent of the total features selected by AdaBoost.
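
As a rough illustration of the MRMR idea named above, the following is a minimal sketch of greedy MRMR ranking in its common "difference" form: at each step it picks the feature whose mutual information with the class label (relevance) most exceeds its average mutual information with the features already chosen (redundancy). It assumes discrete-valued features and is not the paper's pipeline: the AdaBoost combination, the fitness function, and bagging are not reproduced. The function names `mutual_information` and `mrmr_rank` and the toy data are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed (x, y) pairs of p(x, y) * log(p(x, y) / (p(x) * p(y))).
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr_rank(features, labels, k):
    """Greedily select up to k feature names by relevance minus mean redundancy.

    `features` maps a feature name to its list of discrete values (assumed layout).
    """
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # sorted so that ties are broken deterministically

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): the label encodes two independent binary factors.
# "a" and "b" are identical copies of the first factor; "c" is the second factor.
x = [0, 0, 1, 1, 0, 0, 1, 1]
y = [0, 1, 0, 1, 0, 1, 0, 1]
labels = [2 * xi + yi for xi, yi in zip(x, y)]
toy_features = {"a": x, "b": list(x), "c": y}
top_two = mrmr_rank(toy_features, labels, k=2)
```

In this toy setup "b" is as relevant as "a" but fully redundant with it, so a relevance-only ranking would have no reason to prefer "c" over "b", whereas the redundancy penalty favours the complementary feature "c" once "a" has been chosen.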
