Genetic Algorithms for Gene Expression Analysis

The major problem for current gene expression analysis techniques is identifying the handful of genes that contribute to a disease from among the thousands measured on gene chips (microarrays). This paper describes a novel neural-genetic hybrid algorithm for gene expression analysis. The genetic algorithm proposes candidate gene combinations for classification and uses the output of a neural network to determine the fitness of each combination. Standard mutation and crossover operations are then applied to find increasingly fit combinations. Experiments on artificial and real-world gene expression databases are reported. The results are also examined for biological plausibility and confirm that the algorithm is a powerful alternative to standard data mining techniques in this domain.
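
The loop described above (candidate gene combinations, fitness from a neural network's output, then mutation and crossover) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the synthetic data, the single sigmoid unit standing in for the neural network, the tournament selection, the elitism, and every name and parameter value (make_data, nn_fitness, run_ga, k, pop_size, and so on) are assumptions chosen only for illustration.

```python
import math
import random

def make_data(n_samples=80, n_genes=30, informative=(3, 7), seed=0):
    """Synthetic expression matrix (assumption); labels depend only on `informative` genes."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
    y = [1 if sum(row[g] for g in informative) > 0 else 0 for row in X]
    return X, y

def nn_fitness(genes, X, y, epochs=30, lr=0.1):
    """Train one sigmoid unit (a stand-in network) on the chosen genes using the
    first half of the samples; return its accuracy on the second half."""
    split = len(X) // 2
    w, b = [0.0] * len(genes), 0.0
    for _ in range(epochs):
        for row, target in zip(X[:split], y[:split]):
            z = b + sum(wi * row[g] for wi, g in zip(w, genes))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = target - 1.0 / (1.0 + math.exp(-z))
            w = [wi + lr * err * row[g] for wi, g in zip(w, genes)]
            b += lr * err
    correct = 0
    for row, target in zip(X[split:], y[split:]):
        z = b + sum(wi * row[g] for wi, g in zip(w, genes))
        correct += int((z > 0) == (target == 1))
    return correct / (len(X) - split)

def crossover(a, b, k, rng):
    """Child draws k distinct genes from the union of both parents' genes."""
    pool = sorted(set(a) | set(b))
    return tuple(sorted(rng.sample(pool, k)))

def mutate(ind, n_genes, rate, rng):
    """With probability `rate`, swap one gene for a gene not already present."""
    genes = list(ind)
    if rng.random() < rate:
        i = rng.randrange(len(genes))
        genes[i] = rng.choice([g for g in range(n_genes) if g not in genes])
    return tuple(sorted(genes))

def run_ga(X, y, k=2, pop_size=20, generations=15, mutation_rate=0.3, seed=1):
    rng = random.Random(seed)
    n_genes = len(X[0])
    cache = {}

    def fit(ind):
        # Fitness is deterministic, so each combination is evaluated once.
        if ind not in cache:
            cache[ind] = nn_fitness(ind, X, y)
        return cache[ind]

    pop = [tuple(sorted(rng.sample(range(n_genes), k))) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        history.append(fit(pop[0]))
        nxt = [pop[0]]  # elitism: carry the current best forward unchanged
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fit)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=fit)
            nxt.append(mutate(crossover(p1, p2, k, rng), n_genes, mutation_rate, rng))
        pop = nxt
    best = max(pop, key=fit)
    history.append(fit(best))
    return best, fit(best), history

X, y = make_data()
best, score, history = run_ga(X, y)
print("best gene combination:", best, "held-out accuracy:", score)
```

Because the best individual is carried over unchanged each generation, the best fitness recorded in history never decreases. On real microarray data the fitness function would more plausibly use a multi-layer network and cross-validation, which this sketch omits for brevity.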
