Gene expression analysis: Joint feature selection and classifier design

Recently developed high-throughput technologies, including oligonucleotide arrays (Lockhart et al., 1996), DNA microarrays (Schena et al., 1995), and SAGE (Velculescu et al., 1995), enable us to simultaneously quantify the expression levels of thousands of genes in a population of cells. As one application of these technologies, gene expression profiles can be generated from a collection of cancerous and non-cancerous tumor tissue samples and then stored in a database. Kernel methods like the support vector machine (SVM) and the relevance vector machine (RVM) have been shown to accurately predict the disease status of an undiagnosed patient by statistically comparing his or her profile of gene expression levels against a database of profiles from diagnosed patients (Golub et al.). Despite this early success, the presence of a significant number of irrelevant features (here, genes in the profile that are unrelated to the disease status of the tissue) makes such analysis somewhat prone to the curse of dimensionality. Intuitively, overcoming the curse of dimensionality requires that we build classifiers that rely exclusively on information from the genes in the profile that are truly relevant to the disease status of the tissue. This problem of identifying the features most relevant to the classification task is known as feature selection. In this chapter, we review current methods of feature selection, focusing especially on the many recent results that have been reported in the context of gene expression analysis. Then we present a new Bayesian EM algorithm that jointly accomplishes

[1]  S. Chib, et al. Bayesian analysis of binary and polychotomous response data, 1993.

[2]  Ronald W. Davis, et al. Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA Microarray, 1995, Science.

[3]  D. Lockhart, et al. Expression monitoring by hybridization to high-density oligonucleotide arrays, 1996, Nature Biotechnology.

[4]  R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[5]  Ron Kohavi, et al. Wrappers for Feature Subset Selection, 1997, Artif. Intell.

[6]  Pat Langley, et al. Selection of Relevant Features and Examples in Machine Learning, 1997, Artif. Intell.

[7]  David Barber, et al. Bayesian Classification With Gaussian Processes, 1998, IEEE Trans. Pattern Anal. Mach. Intell.

[8]  Olvi L. Mangasarian, et al. Arbitrary-norm separating plane, 1999, Oper. Res. Lett.

[9]  J. Mesirov, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, 1999, Science.

[10]  U. Alon, et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays, 1999, Proceedings of the National Academy of Sciences of the United States of America.

[11]  Matthias W. Seeger, et al. Bayesian Model Selection for Support Vector Machines, Gaussian Processes and Other Kernel Classifiers, 1999, NIPS.

[12]  Sayan Mukherjee, et al. Feature Selection for SVMs, 2000, NIPS.

[13]  Nello Cristianini, et al. Support vector machine classification and validation of cancer tissue samples using microarray expression data, 2000, Bioinform.

[14]  Nello Cristianini, et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, 2000.

[15]  Nir Friedman, et al. Tissue classification with gene expression profiles, 2000, RECOMB '00.

[16]  Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[17]  M. Ringnér, et al. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks, 2001, Nature Medicine.

[18]  Anil K. Jain, et al. Bayesian learning of sparse classifiers, 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[19]  E. Boerwinkle, et al. Feature (gene) selection in gene expression-based tumor classification, 2001, Molecular genetics and metabolism.

[20]  Michael E. Tipping. Sparse Bayesian Learning and the Relevance Vector Machine, 2001.

[21]  T. Poggio, et al. Multiclass cancer diagnosis using tumor gene expression signatures, 2001, Proceedings of the National Academy of Sciences of the United States of America.

[22]  Yi Li, et al. Bayesian automatic relevance determination algorithms for classifying gene expression data, 2002, Bioinformatics.

[23]  Ji Huang, et al. [Serial analysis of gene expression], 2002, Yi chuan = Hereditas.

[24]  Geoffrey J McLachlan, et al. Selection bias in gene extraction on the basis of microarray gene-expression data, 2002, Proceedings of the National Academy of Sciences of the United States of America.

[25]  Alain Rakotomamonjy, et al. Variable Selection Using SVM-based Criteria, 2003, J. Mach. Learn. Res.

[26]  Mário A. T. Figueiredo. Adaptive Sparseness for Supervised Learning, 2003, IEEE Trans. Pattern Anal. Mach. Intell.

[27]  Lawrence Carin, et al. An EM Algorithm for Joint Feature Selection and Classifier Design, 2003.

[28]  Bernhard Schölkopf, et al. Use of the Zero-Norm with Linear Models and Kernel Methods, 2003, J. Mach. Learn. Res.

[29]  Ji Zhu, et al. Margin Maximizing Loss Functions, 2003, NIPS.

[30]  T. Hastie, et al. Classification of gene microarrays by penalized logistic regression, 2004, Biostatistics.

[31]  Glenn Fung, et al. A Feature Selection Newton Method for Support Vector Machine Classification, 2004, Comput. Optim. Appl.

[32]  Volker Roth, et al. The generalized LASSO, 2004, IEEE Transactions on Neural Networks.

[33]  Lawrence Carin, et al. Joint Classifier and Feature Optimization for Comprehensive Cancer Diagnosis Using Gene Expression Data, 2004, J. Comput. Biol.