Paper information - Identifying DNA splice sites using hypernetworks with artificial molecular evolution

Identifying DNA splice sites is a main task in gene hunting. We introduce the hypernetwork architecture as a novel method for finding DNA splice sites. The hypernetwork architecture is a biologically inspired information processing system composed of networks of molecules forming cells, and a number of cells forming a tissue or organism. Its learning is based on molecular evolution. DNA examples taken from GenBank were translated into binary strings and fed into a hypernetwork for training. We used two-fold cross-validation to explore the generalization performance of hypernetwork learning on this data set. The hypernetwork's generalization performance was comparable to that of well-known classification algorithms. Using the best hypernetwork obtained, together with local information and heuristic rules, we built a system (HyperExon) to obtain splice site candidates. The HyperExon system outperformed leading splice recognition systems on the sequences tested.
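
The data preparation mentioned in the abstract (DNA translated into binary strings, then evaluated by two-fold cross-validation) can be illustrated with a minimal sketch. The abstract does not specify the encoding, so the two-bits-per-nucleotide mapping below is an assumption, and the names `NUCLEOTIDE_BITS`, `encode_dna`, and `two_fold_split` as well as the example sequences are hypothetical. The hypernetwork itself is not modeled here.

```python
import random

# Assumed 2-bit code per nucleotide; the paper's actual encoding
# is not given in the abstract.
NUCLEOTIDE_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def encode_dna(sequence):
    """Translate a DNA string into a binary string, two bits per base."""
    return "".join(NUCLEOTIDE_BITS[base] for base in sequence.upper())


def two_fold_split(examples, seed=0):
    """Shuffle the examples and return two (train, test) pairs.

    The data is cut into two halves; each half serves once as the
    training set and once as the test set.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    first, second = shuffled[:half], shuffled[half:]
    return [(first, second), (second, first)]


if __name__ == "__main__":
    # Made-up sequence windows, for illustration only.
    windows = ["AGGTAAGT", "CAGGTGAG", "TTTCAGGC", "ACGTACGT"]
    encoded = [encode_dna(w) for w in windows]
    for train, test in two_fold_split(encoded):
        print(len(train), "training /", len(test), "test examples")
```

In a two-fold setup each example is used for testing exactly once, so the two test scores can be averaged into a single estimate of generalization performance.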

Silvano Colombano | Jose L. Segovia-Juarez | Denise E. Kirschner
