An artificial neural network method for combining gene prediction based on equitable weights

Gene prediction remains an important step in genome annotation. In this paper, we propose a novel method for recognizing genes in genomes. The method combines three well-known gene-finding programs. After the accuracy parameters are calculated, an equitable weight for each parameter is determined using a genetic algorithm. An integrative evaluation is then computed from the weighted parameters and used to guide the training of an artificial neural network. The simulation results show that the proposed method integrates the advantages of the three programs and clearly improves accuracy, which indicates that the proposed method has a strong capability for gene prediction.
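The evaluation step described above can be illustrated with a minimal sketch. The abstract does not name the accuracy parameters or the form of the integrative evaluation, so everything below is an assumption: the parameters are taken to be nucleotide-level sensitivity (Sn) and specificity (Sp, defined as TP/(TP+FP), as is common in gene-finding evaluation), the integrative evaluation is taken to be a normalized weighted sum, and the weights are fixed by hand rather than searched with a genetic algorithm. The function names are hypothetical.

```python
from typing import Dict, Sequence


def accuracy_parameters(truth: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Nucleotide-level Sn and Sp for one program (assumed parameter set).

    truth and pred are 0/1 labels per nucleotide (1 = coding).
    """
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # fraction of coding bases found
    sp = tp / (tp + fp) if (tp + fp) else 0.0  # fraction of predicted bases that are coding
    return {"sn": sn, "sp": sp}


def integrative_score(params: Dict[str, float], weights: Dict[str, float]) -> float:
    """Assumed form of the integrative evaluation: a normalized weighted sum."""
    total = sum(weights.values())
    return sum(weights[name] * params[name] for name in weights) / total


# Toy data: one annotated sequence and the output of one hypothetical program.
truth = [1, 1, 1, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0]

# Hand-picked weights standing in for the values a genetic algorithm would search for.
weights = {"sn": 0.5, "sp": 0.5}

score = integrative_score(accuracy_parameters(truth, pred), weights)
```

In the method as the abstract describes it, a score of this kind would be computed for each of the three programs and then used to guide the training of the neural network; that training step is not sketched here because the abstract gives no detail on the network's inputs or architecture.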
