An ensemble feature selection technique for cancer recognition.

This paper combines correlation-based feature selection (CFS) using neighborhood mutual information (NMI) with particle swarm optimization (PSO) into an ensemble technique. Building on this combination, an efficient gene selection algorithm, denoted NMICFS-PSO, is proposed. Several cancer recognition tasks are used to test the proposed technique. A support vector machine (SVM) classifier, evaluated with leave-one-out cross-validation, is applied to six classification profiles to calculate the classification accuracy. Experimental results show that the proposed method effectively reduces redundant features and achieves superior performance. The classification accuracy obtained by our method is higher than that of the other classification methods on five of the six gene expression problems.
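To make the idea of a correlation-based subset score concrete, the following is a minimal sketch of the standard CFS merit heuristic, which rewards gene subsets that correlate with the class label and penalizes subsets whose genes correlate with each other. It is not the implementation from the paper: absolute Pearson correlation is swapped in as a simple stand-in for neighborhood mutual information, the PSO search over subsets is omitted, and the function names (`pearson`, `cfs_merit`) and the toy data are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cfs_merit(features, labels):
    """CFS merit of a subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    features: list of gene columns, each a list with one value per sample.
    r_cf: mean |correlation| between each gene and the class label.
    r_ff: mean |correlation| between pairs of genes (redundancy).
    Assumption: Pearson correlation stands in for the NMI measure.
    """
    k = len(features)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(f, labels)) for f in features) / k
    pairs = list(combinations(features, 2))
    r_ff = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Hypothetical toy data: four samples, two classes.
labels = [0, 0, 1, 1]
g1 = [1.0, 2.0, 3.0, 4.0]
g2 = [2.0, 4.0, 6.0, 8.0]  # exact multiple of g1, so fully redundant
g3 = [2.0, 1.0, 4.0, 3.0]  # as relevant as g1 but only partly correlated with it

redundant_pair = cfs_merit([g1, g2], labels)
complementary_pair = cfs_merit([g1, g3], labels)
```

On this toy data the redundant pair scores no better than `g1` alone, while the complementary pair scores higher. In a wrapper such as the one the abstract describes, a search procedure like PSO would propose candidate subsets and a score of this kind would rank them.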
