Classifying gene expression data of cancer using classifier ensemble with mutually exclusive features

The explosion of DNA and protein sequence data in public and private databases has encouraged interdisciplinary research spanning biology and information technology. A gene expression profile is simply a sequence of numbers, and the need for tools that extract useful information from such profiles has grown significantly. To predict the cancer class of a patient from a gene expression profile, this paper presents a classification framework that combines a pair of classifiers trained with mutually exclusive features. The idea behind selecting features with nonoverlapping correlation is to encourage the classifier ensemble, which consists of multiple classifiers, to learn different aspects of the training data, so that the classifiers can search a wide solution space. Experimental results show that the classifier ensemble produces higher recognition accuracy than conventional classifiers.
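The framework described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the ranking criterion, the base classifiers, or the combination rule, so the sketch assumes Pearson correlation with the class label for ranking, a simple nearest-centroid classifier as a stand-in, and summed scores for combination. All function and class names (`rank_genes`, `split_exclusive`, `CentroidClassifier`, `ensemble_predict`) are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_genes(X, y):
    """Return gene indices sorted by |correlation with the class label|, strongest first.

    Assumption: Pearson correlation is used here only as an example criterion.
    """
    n_genes = len(X[0])
    scores = [abs(pearson([row[g] for row in X], y)) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)


def split_exclusive(ranked, k):
    """Deal the top 2k ranked genes alternately into two disjoint feature sets."""
    top = ranked[: 2 * k]
    return top[0::2], top[1::2]


class CentroidClassifier:
    """Nearest-centroid classifier restricted to a fixed subset of genes (a stand-in base learner)."""

    def __init__(self, genes):
        self.genes = list(genes)
        self.centroids = {}

    def fit(self, X, y):
        for label in set(y):
            rows = [row for row, t in zip(X, y) if t == label]
            self.centroids[label] = [
                sum(r[g] for r in rows) / len(rows) for g in self.genes
            ]
        return self

    def scores(self, row):
        """Negative squared distance to each class centroid (higher means closer)."""
        return {
            label: -sum((row[g] - c) ** 2 for g, c in zip(self.genes, centroid))
            for label, centroid in self.centroids.items()
        }


def ensemble_predict(classifiers, row):
    """Combine the members by summing their per-class scores and taking the best class."""
    total = {}
    for clf in classifiers:
        for label, s in clf.scores(row).items():
            total[label] = total.get(label, 0.0) + s
    return max(total, key=total.get)


# Toy data: 4 samples x 4 genes; genes 0 and 1 separate the two classes.
X = [
    [0.0, 0.1, 5.0, 0.2],
    [0.1, 0.0, 4.0, 0.9],
    [1.0, 0.9, 5.0, 0.1],
    [0.9, 1.0, 4.0, 0.8],
]
y = [0, 0, 1, 1]

set_a, set_b = split_exclusive(rank_genes(X, y), k=1)
pair = [CentroidClassifier(set_a).fit(X, y), CentroidClassifier(set_b).fit(X, y)]
prediction = ensemble_predict(pair, [0.95, 0.95, 4.5, 0.5])
```

Because the two feature sets share no gene, each member sees a different view of the training data, which is the property the abstract relies on. Any ranking criterion or base classifier could be substituted without changing the overall structure.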
