Algorithms for splicing junction donor recognition in genomic DNA sequences

The consensus sequences at splicing junctions in genomic DNA are required for the breaking and rejoining of pre-mRNA, which must be carried out precisely. Programs currently available for identifying or predicting transcribed sequences within genomic DNA are far from powerful enough to elucidate genomic structure completely. In this paper, we develop pattern matching algorithms for 5' splicing site (donor site) recognition. Using the Motif models we develop, we can extract the degenerate pattern information from the consensus splicing junction sequences. The experimental results show that our algorithm correctly recognized 93% of the total donor sites at the right positions in the test DNA group. Furthermore, more than 91% of the donor sites were correctly predicted by our algorithm. These precision rates are higher than those of the best existing donor classification algorithm.
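
To illustrate what extracting degenerate pattern information from aligned donor sites can look like, the sketch below scores candidate sites with a generic position weight matrix. It is not the Motif models developed in this paper, which the abstract does not specify. The training windows, the window layout (3 exon bases followed by 6 intron bases), the uniform background, the pseudocount, and the score threshold are all illustrative assumptions, and the function names are hypothetical.

```python
import math

# Toy aligned donor-site windows: 3 exon bases + 6 intron bases.
# Illustrative sequences only (assumption), not data from the paper.
TRAINING_SITES = [
    "CAGGTAAGT",
    "AAGGTGAGT",
    "CAGGTAAGA",
    "TAGGTGAGT",
    "CAGGTAAGG",
    "GAGGTATGT",
]
ALPHABET = "ACGT"
BACKGROUND = 0.25   # assumed uniform base composition
PSEUDOCOUNT = 1.0   # avoids log(0) for bases unseen at a position


def build_log_odds(sites):
    """Return one dict per position mapping base -> log2(frequency / background)."""
    width = len(sites[0])
    matrix = []
    for pos in range(width):
        counts = {base: PSEUDOCOUNT for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2(counts[base] / total / BACKGROUND) for base in ALPHABET}
        )
    return matrix


def scan_donor_sites(sequence, matrix, exon_len=3, threshold=0.0):
    """Return (gt_index, score) for each window with GT at the junction
    whose summed log-odds score exceeds the threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Only consider windows with the canonical GT dinucleotide at the intron start.
        if window[exon_len:exon_len + 2] != "GT":
            continue
        # Skip windows containing ambiguous bases such as N.
        if any(base not in ALPHABET for base in window):
            continue
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score > threshold:
            hits.append((start + exon_len, score))
    return hits


if __name__ == "__main__":
    pwm = build_log_odds(TRAINING_SITES)
    for index, score in scan_donor_sites("TTTTCAGGTAAGTCCCC", pwm):
        print(f"candidate donor site: GT at index {index}, score {score:.2f}")
```

In this toy input the sequence contains two GT dinucleotides; only the one embedded in a consensus-like context scores above the threshold, which is the basic idea behind separating true donor sites from incidental GT occurrences.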