Mathematical modeling of translation initiation for the estimation of its efficiency to computationally design mRNA sequences with desired expression levels in prokaryotes

Background: Within the emerging field of synthetic biology, engineering paradigms have recently been used to design biological systems with novel functionalities. One of the essential challenges hampering the construction of such systems is the need to precisely optimize protein expression levels for robust operation. However, it is difficult to design mRNA sequences for expression at targeted protein levels, since even a few nucleotide modifications around the start codon may alter translational efficiency and dramatically (up to 250-fold) change protein expression. Previous studies have used ad hoc approaches (e.g., random mutagenesis) to obtain the desired translational efficiencies for mRNA sequences. Hence, the development of a mathematical methodology capable of estimating translational efficiency would greatly facilitate the future design of mRNA sequences aimed at yielding desired protein expression levels.

Results: We herein propose a mathematical model that focuses on translation initiation, which is the rate-limiting step in translation. The model uses mRNA-folding dynamics and ribosome-binding dynamics to estimate translational efficiencies solely from mRNA sequence information. We confirmed the feasibility of our model using previously reported expression data on the MS2 coat protein. For further confirmation, we used our model to design 22 luxR mRNA sequences predicted to have diverse translation efficiencies ranging from 10⁻⁵ to 1. The expression levels of these sequences were measured in Escherichia coli and found to be highly correlated (R² = 0.87) with their estimated translational efficiencies. Moreover, we used our computational method to successfully transform a low-expressing DsRed2 mRNA sequence into a high-expressing mRNA sequence by maximizing its translational efficiency through the modification of only eight nucleotides upstream of the start codon.

Conclusions: We herein describe a mathematical model that uses mRNA sequence information to estimate translational efficiency. This model could be used to design best-fit mRNA sequences having a desired protein expression level, thereby facilitating protein over-production in biotechnology or the protein expression-level optimization necessary for the construction of robust networks in synthetic biology.
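To give a feel for why small changes in mRNA folding stability near the start codon can shift expression by orders of magnitude, the following is a minimal sketch of a two-state equilibrium picture. It is an illustrative simplification and not the model described in the abstract, which uses folding and ribosome-binding dynamics. The function name `open_fraction`, the default temperature of 37 °C, and the example free energies are assumptions introduced here; in practice the folding free energy would come from a secondary-structure prediction tool.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)

def open_fraction(dg_fold_kcal, temp_k=310.15):
    """Equilibrium fraction of mRNA whose initiation region is unfolded.

    Assumes a hypothetical two-state (folded/open) model, where
    dg_fold_kcal is the folding free energy in kcal/mol
    (more negative = more stable structure).
    """
    x = dg_fold_kcal / (GAS_CONSTANT * temp_k)
    # Numerically stable logistic function 1 / (1 + exp(-x)).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

if __name__ == "__main__":
    # Hypothetical folding energies for a series of sequence variants.
    for dg in (0.0, -1.0, -3.0, -5.0, -7.0):
        print(f"dG = {dg:5.1f} kcal/mol -> open fraction = {open_fraction(dg):.2e}")
```

Under this toy assumption, each additional kcal/mol of structure stability at 37 °C reduces the open fraction by roughly a factor of five once the structure is stable, which is consistent in spirit with the abstract's observation that a few nucleotide changes can alter expression dramatically.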
