EXONSCAN: EXON prediction with Signal detection and Coding region AligNment in homologous sequences

Identifying protein-coding genes in genomic sequences is an important and challenging problem. Many computational gene prediction programs have been proposed that achieve satisfactory sensitivity and specificity at the nucleotide level, but their sensitivity and specificity at the exon level are often low. Here we propose EXONSCAN, a novel exon prediction program that combines signal detection with CORAL (COding Region ALignment), which aligns homologous genomic sequences based on the conservation of protein-coding regions. EXONSCAN first uses signal detection and CORAL to find candidate exons, and then predicts gene structures by assembling the predicted exons. We tested the program on the ROSETTA data set of 117 human-mouse sequence pairs. The results show that the sensitivity and specificity of EXONSCAN are both 98% at the nucleotide level, and 87% and 89%, respectively, at the exon level. These results are superior to those of all existing programs.
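
The gap between nucleotide-level and exon-level accuracy can be illustrated with a minimal sketch. It assumes the definitions conventional in gene prediction evaluation, which the text above does not spell out: sensitivity is the fraction of annotated items that are predicted correctly, specificity is the fraction of predicted items that are correct, and an exon counts as correct only if both of its boundaries match exactly. The function names and the toy coordinates are hypothetical and are not taken from EXONSCAN or the ROSETTA data set.

```python
from typing import List, Set, Tuple

Exon = Tuple[int, int]  # (start, end), inclusive coordinates (assumed convention)


def nucleotides(exons: List[Exon]) -> Set[int]:
    """Return the set of positions covered by a list of exons."""
    covered: Set[int] = set()
    for start, end in exons:
        covered.update(range(start, end + 1))
    return covered


def ratio(hits: int, total: int) -> float:
    """Safe division: an empty denominator yields 0.0."""
    return hits / total if total else 0.0


def nucleotide_level(annotated: List[Exon], predicted: List[Exon]) -> Tuple[float, float]:
    """Sensitivity and specificity counted per coding nucleotide."""
    true_nt = nucleotides(annotated)
    pred_nt = nucleotides(predicted)
    correct = len(true_nt & pred_nt)
    return ratio(correct, len(true_nt)), ratio(correct, len(pred_nt))


def exon_level(annotated: List[Exon], predicted: List[Exon]) -> Tuple[float, float]:
    """Sensitivity and specificity counted per exon (exact boundary match)."""
    true_exons = set(annotated)
    pred_exons = set(predicted)
    correct = len(true_exons & pred_exons)
    return ratio(correct, len(true_exons)), ratio(correct, len(pred_exons))


# Toy example: the second exon is predicted with a shifted start,
# and the fourth predicted exon has no annotated counterpart.
annotated = [(100, 199), (300, 399), (500, 599)]
predicted = [(100, 199), (310, 399), (500, 599), (700, 749)]

nt_sn, nt_sp = nucleotide_level(annotated, predicted)
ex_sn, ex_sp = exon_level(annotated, predicted)
print(f"nucleotide level: Sn={nt_sn:.3f} Sp={nt_sp:.3f}")
print(f"exon level:       Sn={ex_sn:.3f} Sp={ex_sp:.3f}")
```

In this toy case a single boundary shifted by ten bases costs little at the nucleotide level but removes a whole exon from the exon-level count, which is why exon-level figures are typically the harder of the two to raise.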