An accelerated procedure for recursive feature ranking on microarray data

We describe a new wrapper algorithm for fast feature ranking in classification problems. The Entropy-based Recursive Feature Elimination (E-RFE) method eliminates chunks of uninteresting features according to the entropy of the weight distribution of an SVM classifier. Designed with DNA microarray datasets in mind, the method supports computationally intensive model selection in classification problems where the number of features is much larger than the number of samples. We test E-RFE on synthetic and real datasets, comparing it with other SVM-based methods. The speed-up obtained with E-RFE supports predictive modeling on high-dimensional microarray data.
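
The chunk-elimination idea can be illustrated with a minimal sketch of a single elimination step. The abstract does not give the exact rule, so everything specific here is an assumption: the function names `weight_entropy` and `features_to_drop`, the 10-bin histogram, the default threshold of half the maximum entropy, and the choice to fall back to removing one feature when the entropy is high. The weights are taken to be the coefficients of a linear SVM retrained on the surviving features, which this sketch does not train.

```python
import math


def weight_entropy(weights, n_bins=10):
    """Shannon entropy (in bits) of the histogram of min-max-normalised |weights|."""
    mags = [abs(w) for w in weights]
    lo, hi = min(mags), max(mags)
    span = hi - lo
    counts = [0] * n_bins
    for m in mags:
        # Map each magnitude to [0, 1], then to a bin; the maximum lands in the last bin.
        pos = (m - lo) / span if span > 0 else 0.0
        counts[min(int(pos * n_bins), n_bins - 1)] += 1
    total = len(mags)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def features_to_drop(weights, n_bins=10, entropy_threshold=None):
    """Indices of the features to eliminate in one step.

    Assumed rule (illustrative, not taken from the paper): when the weight
    histogram has low entropy, most weights sit in a few bins, so every
    feature in the lowest bin is dropped at once; otherwise only the single
    smallest-weight feature is dropped, as in one-at-a-time RFE.
    """
    if entropy_threshold is None:
        # Hypothetical default: half of the maximum possible entropy.
        entropy_threshold = 0.5 * math.log2(n_bins)
    mags = [abs(w) for w in weights]
    lo, hi = min(mags), max(mags)
    if hi > lo and weight_entropy(weights, n_bins) < entropy_threshold:
        cutoff = lo + (hi - lo) / n_bins  # upper edge of the lowest bin
        return [i for i, m in enumerate(mags) if m < cutoff]
    # High entropy (or all weights equal): remove just the weakest feature.
    return [min(range(len(mags)), key=mags.__getitem__)]


# Synthetic weights: 95 near-zero features and 5 informative ones.
skewed = [0.01] * 95 + [1.0, 0.9, 0.8, 0.95, 0.85]
# Synthetic weights spread evenly over the range.
spread = [i / 100 for i in range(1, 101)]

print(len(features_to_drop(skewed)), len(features_to_drop(spread)))
```

With the skewed weights the step removes all the near-zero features at once, while the evenly spread weights lead to a single removal. This is where the speed-up over one-at-a-time elimination would come from, since each step requires retraining the classifier.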
