Detection of polyadenylation signals in human DNA sequences.

We present polyadq, a program for detecting human polyadenylation signals. To avoid training on possibly flawed data, development of polyadq began with a de novo characterization of human mRNA 3' processing signals. This information was used to train two quadratic discriminant functions that polyadq uses to evaluate potential polyA signals. In our tests, polyadq predicts polyA signals with a correlation coefficient of 0.413 on whole genes and 0.512 in the last two exons of genes, substantially outperforming other published programs on the same data set. polyadq is also the only program that consistently detects the ATTAAA variant of the polyA signal.
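
As a rough illustration of two ideas mentioned above, the following sketch scans a sequence for the two signal hexamers (AATAAA and ATTAAA) and computes a correlation coefficient from prediction counts. It is not polyadq's implementation: the function names are hypothetical, the quadratic discriminant scoring step is omitted, and the correlation coefficient is assumed to be the Matthews-style measure commonly used to evaluate gene-finding programs, which the abstract does not specify.

```python
import math

# The two signal variants named in the abstract.
SIGNAL_VARIANTS = ("AATAAA", "ATTAAA")


def find_candidate_signals(seq):
    """Return (position, hexamer) for every candidate signal in seq.

    Hypothetical helper: it only locates candidates. A real predictor
    would then score each candidate (e.g. with a discriminant function).
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in SIGNAL_VARIANTS:
            hits.append((i, hexamer))
    return hits


def correlation_coefficient(tp, fp, tn, fn):
    """Matthews-style correlation coefficient from confusion counts.

    Assumption: this is the measure behind the reported 0.413 and 0.512.
    Returns 0.0 when the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    print(find_candidate_signals("ccAATAAAggATTAAAtt"))
    print(correlation_coefficient(tp=5, fp=0, tn=5, fn=0))
```

A coefficient of 1 would indicate perfect agreement between predicted and true signals, 0 no better than chance, and -1 complete disagreement.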
