A New Implementation of Recursive Feature Elimination Algorithm for Gene Selection from Microarray Data

We propose a new approach to gene selection and multi-cancer classification based on step-by-step improvement of classification performance (SSiCP). The SSiCP gene selection algorithms were evaluated on the NCI60 and GCM benchmark datasets, reaching accuracies of 96.6% and 95.5%, respectively, in 10-fold cross-validation. Furthermore, SSiCP outperformed recently published algorithms when applied to two other multi-cancer datasets. Computational evidence indicates that SSiCP effectively avoids overfitting. Compared with various gene selection algorithms, SSiCP is very simple to implement, and all the computational experiments are repeatable.
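
The abstract does not spell out the SSiCP procedure, so the following is only a minimal sketch of the general idea of recursive elimination driven by cross-validated accuracy, not the authors' algorithm. It assumes a nearest-centroid classifier, a greedy rule that drops a gene whenever 10-fold accuracy does not decrease, and synthetic toy data; the function names (`make_toy_data`, `cv_accuracy`, `stepwise_eliminate`) are hypothetical.

```python
import random

def make_toy_data(n_per_class=20, n_genes=8, seed=0):
    """Synthetic 3-class data (an assumption): genes 0 and 1 are informative, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_per_class):
        for label in range(3):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[0] += 3.0 * label
            row[1] -= 3.0 * label
            X.append(row)
            y.append(label)
    return X, y

def nearest_centroid_predict(train_X, train_y, test_X, features):
    """Assign each test sample to the closest class centroid, using only `features`."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, t in zip(train_X, train_y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    preds = []
    for x in test_X:
        point = [x[f] for f in features]
        preds.append(min(
            sorted(centroids),
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        ))
    return preds

def cv_accuracy(X, y, features, k=10):
    """k-fold cross-validated accuracy with fixed, interleaved folds."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        preds = nearest_centroid_predict(
            [X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], features,
        )
        correct += sum(p == y[i] for p, i in zip(preds, test_idx))
    return correct / n

def stepwise_eliminate(X, y, k=10):
    """Greedy backward elimination: drop a gene if CV accuracy does not get worse."""
    selected = list(range(len(X[0])))
    best = cv_accuracy(X, y, selected, k)
    improved = True
    while improved and len(selected) > 1:
        improved = False
        for f in list(selected):
            trial = [g for g in selected if g != f]
            acc = cv_accuracy(X, y, trial, k)
            if acc >= best:
                # Accept the smaller gene set and restart the scan.
                selected, best, improved = trial, acc, True
                break
    return selected, best

X, y = make_toy_data()
baseline = cv_accuracy(X, y, list(range(len(X[0]))))
selected, best = stepwise_eliminate(X, y)
print("selected genes:", selected)
print("CV accuracy: all genes =", baseline, ", selected =", best)
```

Because the same cross-validation folds both guide the selection and score the result, the accuracy returned here is a selection criterion rather than an unbiased performance estimate; an independent test set or nested cross-validation would be needed for that.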
