Boosting for feature selection for microarray data analysis

We investigate boosting techniques for feature selection in microarray data analysis. We propose a novel feature selection algorithm and test it on three datasets. The results show that our boosting-based feature selection outperformed the Wilcoxon-Mann-Whitney U-test, a method commonly used in microarray data analysis, and that boosting ensembles built from the features it selected were more accurate.
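For orientation, the baseline mentioned above can be sketched as a per-gene ranking by the Wilcoxon-Mann-Whitney U statistic. This is only a minimal illustration of that baseline, not the proposed boosting algorithm, which the abstract does not describe. The function names, the dictionary-based data layout, and the binary 0/1 labels are assumptions made for the example.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def u_statistic(group_a, group_b):
    """Mann-Whitney U for group_a, computed from its rank sum."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


def rank_genes_by_u(expression, labels, top_k):
    """Rank genes by how far U lies from its null expectation n_a * n_b / 2.

    expression: hypothetical layout, dict mapping gene name -> list of
                expression values, one per sample.
    labels:     list of 0/1 class labels, one per sample.
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, y in zip(values, labels) if y == 1]
        b = [v for v, y in zip(values, labels) if y == 0]
        u = u_statistic(a, b)
        scores[gene] = abs(u - len(a) * len(b) / 2)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy data: g1 separates the classes perfectly, g2 barely at all.
    expression = {
        "g1": [1, 2, 3, 10, 11, 12],
        "g2": [5, 1, 6, 2, 7, 3],
        "g3": [1, 10, 2, 11, 3, 12],
    }
    labels = [0, 0, 0, 1, 1, 1]
    print(rank_genes_by_u(expression, labels, top_k=2))
```

A filter of this kind scores each gene independently of the others, whereas a boosting-based selector chooses features in the course of building the ensemble.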
