Selecting informative genes with parallel genetic algorithms in tissue classification.

Recent advances in biotechnology make it possible to measure the expression levels of thousands of genes in parallel. Analysis of such data can provide insight into gene function and regulatory mechanisms. Several machine learning approaches have been used to help understand the functions of genes. However, these tasks are made more difficult by the noisy nature of array data and the overwhelming number of gene features. In this paper, we use a parallel genetic algorithm to select the genes that are informative for classification. By combining it with the classification method proposed by Golub et al. and Slonim et al., we classify data sets containing tissues of different classes, and we present preliminary results.
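
The abstract does not give implementation details, so the following is only a minimal sketch of how the two pieces might fit together, not the authors' implementation. It assumes an island-model genetic algorithm (several subpopulations with periodic ring migration, run sequentially here for simplicity) that searches over binary gene masks, and it scores each mask with a weighted-voting classifier in the style of Golub et al., where each gene votes with weight (mu_0 - mu_1) / (sigma_0 + sigma_1). The function names, the fitness definition (training accuracy minus a small size penalty), and all parameter values are illustrative assumptions.

```python
import random
from statistics import mean, pstdev

def gene_stats(X, y, g):
    """Signal-to-noise weight and decision boundary for gene g (two classes, 0 and 1)."""
    a = [row[g] for row, c in zip(X, y) if c == 0]
    b = [row[g] for row, c in zip(X, y) if c == 1]
    spread = pstdev(a) + pstdev(b) or 1e-9  # guard against zero spread
    return (mean(a) - mean(b)) / spread, (mean(a) + mean(b)) / 2

def predict(stats, genes, row):
    """Weighted voting: a positive total vote means class 0, otherwise class 1."""
    total = sum(stats[g][0] * (row[g] - stats[g][1]) for g in genes)
    return 0 if total > 0 else 1

def fitness(mask, X, y, stats):
    """Assumed fitness: training accuracy minus a small penalty per selected gene."""
    genes = [g for g, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0
    correct = sum(predict(stats, genes, row) == c for row, c in zip(X, y))
    return correct / len(y) - 0.001 * len(genes)

def evolve(pop, X, y, stats, rng, p_mut=0.05):
    """One generation: binary tournament, one-point crossover, bit-flip mutation."""
    score = lambda m: fitness(m, X, y, stats)
    def pick():
        a, b = rng.sample(pop, 2)
        return a if score(a) >= score(b) else b
    children = [max(pop, key=score)]  # elitism: carry the best mask over unchanged
    while len(children) < len(pop):
        p, q = pick(), pick()
        cut = rng.randrange(1, len(p))
        child = p[:cut] + q[cut:]
        children.append([bit ^ (rng.random() < p_mut) for bit in child])
    return children

def parallel_ga(X, y, n_islands=4, pop_size=10, generations=20, migrate_every=5, seed=0):
    """Island-model GA; returns (selected gene indices, fitness of the best mask)."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    stats = [gene_stats(X, y, g) for g in range(n_genes)]
    score = lambda m: fitness(m, X, y, stats)
    islands = [[[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
               for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        # Islands are independent between migrations, so each could run in its own process.
        islands = [evolve(pop, X, y, stats, rng) for pop in islands]
        if gen % migrate_every == 0:
            bests = [max(pop, key=score) for pop in islands]
            for i, pop in enumerate(islands):
                # Ring migration: the best mask of island i-1 replaces the worst of island i.
                worst = min(range(len(pop)), key=lambda k: score(pop[k]))
                pop[worst] = list(bests[i - 1])
    best = max((m for pop in islands for m in pop), key=score)
    return [g for g, bit in enumerate(best) if bit], score(best)
```

Calling `parallel_ga(X, y)` with `X` as a list of expression rows (one per tissue sample) and `y` as 0/1 class labels returns the selected gene indices and the fitness of the best mask. Because the fitness here is measured on the training samples themselves, a real study would replace it with cross-validated or held-out accuracy.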

[1] Peter J. Park, et al. A Nonparametric Scoring Algorithm for Identifying Informative Genes from Microarray Data, 2000, Pacific Symposium on Biocomputing.

[2] Nir Friedman, et al. Class discovery in gene expression data, 2001, RECOMB.

[3] Nir Friedman, et al. Tissue classification with gene expression profiles, 2000.

[4] Jill P. Mesirov, et al. Class prediction and discovery using gene expression data, 2000, RECOMB '00.

[5] Nir Friedman, et al. Scoring Genes for Relevance, 2000.

[6] Laurie J. Heyer, et al. Exploring expression data: identification and analysis of coexpressed genes, 1999, Genome Research.

[7] F. Baas, et al. The Human Transcriptome Map: Clustering of Highly Expressed Genes in Chromosomal Domains, 2001, Science.

[8] Walter L. Ruzzo, et al. Bayesian Classification of DNA Array Expression Data, 2000.

[9] D. Haussler, et al. Knowledge-based analysis of microarray gene expression data by using support vector machines, 2000, Proceedings of the National Academy of Sciences of the United States of America.

[10] J. Mesirov, et al. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation, 1999, Proceedings of the National Academy of Sciences of the United States of America.

[11] J. Mesirov, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, 1999, Science.

[12] D. Botstein, et al. Cluster analysis and display of genome-wide expression patterns, 1998, Proceedings of the National Academy of Sciences of the United States of America.

[13] Masahiro Okamoto, et al. Development of a System for the Inference of Large Scale Genetic Networks, 2000, Pacific Symposium on Biocomputing.

[14] Zohar Yakhini, et al. Clustering gene expression patterns, 1999, J. Comput. Biol.

[15] Lucila Ohno-Machado, et al. Unsupervised Learning from Complex Data: The Matrix Incision Tree Algorithm, 2001, Pacific Symposium on Biocomputing.

[16] Gary D. Stormo, et al. Modeling Regulatory Networks with Weight Matrices, 1998, Pacific Symposium on Biocomputing.

[17] Jerzy W. Bala, et al. Hybrid Learning Using Genetic Algorithms and Decision Trees for Pattern Classification, 1995, IJCAI.

[18] Jihoon Yang, et al. Feature Subset Selection Using a Genetic Algorithm, 1998, IEEE Intell. Syst.

[19] U. Alon, et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays, 1999, Proceedings of the National Academy of Sciences of the United States of America.