A probabilistic model of 3' end formation in Caenorhabditis elegans.

The 3' ends of mRNAs terminate with a poly(A) tail. This post-transcriptional modification is directed by sequence features in the 3'-untranslated region (3'-UTR). We have undertaken a computational analysis of 3' end formation in Caenorhabditis elegans. By aligning cDNAs that diverge from the genomic sequence at the poly(A) tract, we accurately identified a large set of true cleavage sites. When many transcripts align to a particular locus, the cleavage site frequently varies locally over a span of a few bases. We find that, in addition to the well-known AAUAAA motif, there are several regions with distinct nucleotide compositional biases. We propose a generalized hidden Markov model that describes sequence features in C. elegans 3'-UTRs. A computer program employing this model accurately predicts experimentally observed 3' ends, even when there are multiple AAUAAA motifs and multiple cleavage sites. We have made available a complete set of polyadenylation site predictions for the C. elegans genome, including a subset of 6570 supported by aligned transcripts.
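
As an illustration of the cleavage-site identification step only (not the generalized hidden Markov model itself), the following minimal Python sketch marks the position where a cDNA stops matching the genomic sequence and continues as an untemplated run of A residues, then lists AAUAAA hexamers (written AATAAA in DNA) upstream of that position. The function names, the `start` alignment offset, the `min_tail` threshold and the 30-base window are all assumptions made for this example; they are not taken from the paper, and a real analysis would use a gapped, mismatch-tolerant alignment.

```python
from typing import List, Optional


def cleavage_site(genomic: str, cdna: str, start: int,
                  min_tail: int = 8) -> Optional[int]:
    """Return the genomic index of the last cDNA base that matches the genome.

    Assumes (hypothetically) an ungapped, mismatch-free alignment of the
    cDNA beginning at genomic[start]. Returns None unless the unmatched
    remainder of the cDNA is a pure run of at least `min_tail` A residues.
    """
    i = 0
    while (i < len(cdna) and start + i < len(genomic)
           and cdna[i] == genomic[start + i]):
        i += 1
    tail = cdna[i:]
    if i == 0 or len(tail) < min_tail or set(tail) != {"A"}:
        return None
    return start + i - 1


def upstream_signals(genomic: str, site: int, window: int = 30,
                     motif: str = "AATAAA") -> List[int]:
    """Start indices of `motif` lying within `window` bases upstream of `site`."""
    lo = max(0, site - window)
    hits = []
    pos = genomic.find(motif, lo)
    while pos != -1 and pos + len(motif) - 1 <= site:
        hits.append(pos)
        pos = genomic.find(motif, pos + 1)
    return hits


if __name__ == "__main__":
    # Toy sequences invented for illustration; not real C. elegans data.
    genomic = "GGCTTAATAAAGTCTGATTTTGCACCGTGTTT"
    cdna = genomic[2:24] + "A" * 12
    site = cleavage_site(genomic, cdna, start=2)
    print(site, upstream_signals(genomic, site))
```

When the genome itself carries an A at the cleavage position, the boundary between templated sequence and tail cannot be placed exactly, so this sketch reports the last matching base; the ambiguity is distinct from the genuine site-to-site variation described above.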
