Anatomy of Escherichia coli ribosome binding sites.

During translational initiation in prokaryotes, the 3' end of the 16S rRNA binds to a region just upstream of the initiation codon. The relationship between this Shine-Dalgarno (SD) region and the binding of ribosomes to translation start-points has been well studied, but a unified mathematical connection between the SD, the initiation codon and the spacing between them has been lacking. Using information theory, we constructed a model that treats these three components uniformly by assigning to the SD and the initiation region (IR) conservations in bits of information, and by assigning to the spacing an uncertainty, also in bits. To build the model, we first aligned the SD region by maximizing the information content there. The ease of this process confirmed the existence of the SD pattern within a set of 4122 reviewed and revised Escherichia coli gene starts. This large data set allowed us to show graphically, by sequence logos, that the spacing between the SD and the initiation region affects both the SD site conservation and its pattern. We used the aligned SD, the spacing, and the initiation region to model ribosome binding and to identify gene starts that do not conform to the ribosome binding site model. A total of 569 experimentally proven starts are more conserved (have higher information content) than the full set of revised starts, which probably reflects an experimental bias against the detection of gene products that have inefficient ribosome binding sites. Models were refined cyclically by removing non-conforming weak sites. After this procedure, models derived from either the original or the revised gene start annotation were similar. Therefore, this information theory-based technique provides a method for easily constructing biologically sensible ribosome binding site models. Such models should be useful for refining gene-start predictions of any sequenced bacterial genome.
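The two quantities named above, conservation in bits at each aligned position and uncertainty in bits for the spacing, can be illustrated with a minimal sketch. It assumes a four-letter RNA alphabet with equiprobable background and applies no small-sample correction. The function names `position_information` and `spacing_uncertainty` and the toy sequences are hypothetical, chosen only for illustration; this is not the authors' software or their data.

```python
import math
from collections import Counter


def position_information(seqs, alphabet="ACGU"):
    """Information (bits) at each column of equal-length aligned sequences.

    Computed as log2(alphabet size) minus the Shannon uncertainty of the
    observed base frequencies. No small-sample correction is applied
    (a simplifying assumption of this sketch).
    """
    max_bits = math.log2(len(alphabet))
    info = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts.values())
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(max_bits - h)
    return info


def spacing_uncertainty(spacings):
    """Shannon uncertainty (bits) of a list of observed SD-to-IR spacings."""
    counts = Counter(spacings)
    total = len(spacings)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy (made-up) aligned sites: three fully conserved columns, one random column.
toy_sites = ["AGGA", "AGGU", "AGGC", "AGGG"]
toy_info = position_information(toy_sites)

# Toy (made-up) spacings: two values, equally frequent.
toy_spacing_bits = spacing_uncertainty([7, 7, 9, 9])
```

In this toy case the conserved columns each carry 2 bits and the fully variable column carries 0 bits, while two equally likely spacings give 1 bit of uncertainty. Summing the per-position values gives the total information content of an aligned region, which is the quantity an alignment procedure of the kind described would seek to maximize.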
