Gene Prediction by Syntenic Alignment

Given the growing amount of available genomic DNA, one now faces the task of identifying the functional parts of such raw sequence data, such as the protein-coding regions. The gene prediction problem can be addressed in several ways. The most recent methods exploit similarities between regions of two unannotated genomic sequences in order to find their genes. In this paper we present a new comparative heuristic for the gene prediction problem. It relies on a syntenic alignment of two genomic sequences. We have implemented the proposed heuristic in a computer program and confirmed its validity on a benchmark of 50 pairs of human and mouse genomic sequences.