Given the large amount of genomic DNA now available, one faces the task of identifying the functional parts of such raw sequence data, such as the protein-coding regions. The gene prediction problem can be addressed in several ways. The most recent methods exploit the similarities between regions of two unannotated genomic sequences in order to find their genes. In this paper we present a new comparative heuristic for the gene prediction problem. It relies on a syntenic alignment of two genomic sequences. We have implemented the proposed heuristic in a computer program and validated it on a benchmark of 50 pairs of human and mouse genomic sequences.