Detection of RNA Polymerase II Promoters and Polyadenylation Sites in Human DNA Sequence

Detecting RNA polymerase II promoters and polyadenylation sites helps to locate gene boundaries and can improve the accuracy of gene recognition and modeling in genomic DNA sequence. We describe a system that detects polyadenylation sites and thus delineates the 3' boundary of a gene, and we discuss improvements to a system first described in Matis et al. (1995) [Matis S., Shah M., Mural R. J. & Uberbacher E.C. (1995) Proc. First Wrld Conf. Computat. Med., Public Hlth, Biotechnol. (Wrld Sci.) (in press).], which predicts a large subset of RNA polymerase II promoters. The promoter system used statistical matrices and distance information as inputs to a neural network that was trained to provide initial promoter recognition. The output of the network was then refined by applying rules that use the gene context information predicted by GRAIL. We have reconstructed the rule-based system, which uses gene context information, and significantly improved the sensitivity and selectivity of promoter detection.
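
As an illustration of what a statistical matrix input of this kind might look like, the following is a minimal sketch of position-specific log-odds scoring, not the system's actual implementation. The training sites (variants of the canonical AATAAA polyadenylation signal), the pseudocount, the uniform background, and all function names are assumptions chosen for the example. In a system like the one described, scores of this sort would presumably be combined with distance information before being passed to the neural network.

```python
import math

BASES = "ACGT"


def build_log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a position-specific log-odds matrix from aligned, equal-length sites.

    Assumption: a uniform background frequency of 0.25 per base.
    """
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return matrix


def score_window(matrix, window):
    """Sum the log-odds scores of each base in a window the width of the matrix."""
    return sum(column[base] for column, base in zip(matrix, window))


def best_hit(matrix, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(matrix)
    scores = [
        (score_window(matrix, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scores)


# Hypothetical toy training set: variants of the AATAAA signal.
sites = ["AATAAA", "AATAAA", "ATTAAA", "AATAAA"]
matrix = build_log_odds_matrix(sites)

# Scan a made-up sequence that contains one AATAAA hexamer.
score, position = best_hit(matrix, "GCGCGCAATAAAGCGC")
```

Here `position` is the 0-based start of the best-scoring window, and a positive `score` means the window resembles the training sites more than the assumed background does.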