Comparative analysis of base correlations in 5' untranslated regions of various species.

Translational initiation signals, such as Shine-Dalgarno (SD) sequences in bacteria and Kozak consensus sequences in vertebrates, direct ribosomes to initiate protein synthesis from mRNAs. Investigating the sequence characteristics of these signals is important, particularly for inferring translational initiation mechanisms. Although various statistical analyses of translational initiation signals have been performed, few have focused on base correlations, which assess dependencies between bases in the signal sequences. We used relative entropy and mutual information to analyze base conservation and base correlation, respectively, in the 5' UTRs of various species. In eukaryotes, we found peaks of relative entropy at position -3 relative to the translational start site but no peak of mutual information at that position, indicating that the base at that position (known as the core base of the Kozak sequence) is well conserved but not correlated with neighboring bases and thus functions as a single base. We observed unexpected peaks of mutual information between positions -2 and -1 in most eukaryotes. Surprisingly, these base correlations also occurred in some bacteria and archaea, although there were no base preferences at either position. Various dinucleotide patterns existed at these positions, and the correlation between the bases at -2 and -1 may be relevant to the context of translational initiation. Because the dinucleotide patterns of the correlated nucleotide pairs at -2 and -1 were not unique within the respective organisms, the correlation could not be found by analyzing single-nucleotide conservation. Therefore, mutual information allowed us to discover signals that were not found by simply analyzing base conservation.
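The distinction between the two measures can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the uniform background distribution, and the four toy sequences are assumptions chosen only to show how a position can be conserved without being correlated (as at -3), and how two positions can be correlated without either showing a base preference (as at -2 and -1).

```python
import math
from collections import Counter

BASES = "ACGT"

def column(seqs, pos):
    """Bases at one aligned position; pos=-1 is the base just before the start codon."""
    return [s[pos] for s in seqs]

def relative_entropy(col, background):
    """Kullback-Leibler divergence (bits) of observed base frequencies from a background."""
    n = len(col)
    counts = Counter(col)
    total = 0.0
    for b in BASES:
        p = counts[b] / n
        if p > 0:  # terms with p = 0 contribute nothing
            total += p * math.log2(p / background[b])
    return total

def mutual_information(col_x, col_y):
    """Mutual information (bits) between the bases at two aligned positions."""
    n = len(col_x)
    px, py = Counter(col_x), Counter(col_y)
    pxy = Counter(zip(col_x, col_y))
    total = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        total += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return total

# Hypothetical toy data: the three bases upstream of the start codon (-3, -2, -1).
# Position -3 is always A; positions -2 and -1 are each uniform but perfectly paired.
toy_utrs = ["AAC", "ACA", "AGT", "ATG"]
uniform = {b: 0.25 for b in BASES}  # assumed background; real analyses estimate it from data

re_m3 = relative_entropy(column(toy_utrs, -3), uniform)
re_m2 = relative_entropy(column(toy_utrs, -2), uniform)
mi_m3_m2 = mutual_information(column(toy_utrs, -3), column(toy_utrs, -2))
mi_m2_m1 = mutual_information(column(toy_utrs, -2), column(toy_utrs, -1))

print(f"relative entropy at -3: {re_m3:.2f} bits")
print(f"relative entropy at -2: {re_m2:.2f} bits")
print(f"mutual information (-3, -2): {mi_m3_m2:.2f} bits")
print(f"mutual information (-2, -1): {mi_m2_m1:.2f} bits")
```

In this toy set, position -3 has high relative entropy and zero mutual information with its neighbor, whereas positions -2 and -1 have zero relative entropy yet maximal mutual information. With real sequence sets, frequency estimates from finite samples inflate mutual information, so some form of small-sample correction or a comparison against shuffled sequences would be needed before interpreting peaks.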
