Codon Context Optimization in Synthetic Gene Design

Advances in de novo DNA synthesis and computational gene design methods make it possible to customize genes by directly manipulating features such as codon bias and mRNA secondary structure. Codon context is another feature that significantly affects mRNA translational efficiency, but existing methods and tools for evaluating and designing novel optimized protein-coding sequences rely on untested heuristics and provide no quantifiable guarantees on design quality. In this study we examine the statistical properties of codon context measures in an effort to better understand the phenomenon. We analyze the computational complexity of codon context optimization and design both exact and efficient heuristic gene recoding algorithms under reasonable constraint models. We also present a web-based tool for evaluating codon context bias in the appropriate context.
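
To make the idea of exact gene recoding concrete, the sketch below shows one way such a problem can be solved when the only objective is the sum of scores over adjacent codon pairs and no further constraints apply. It is an illustrative dynamic program, not necessarily the algorithm developed in this study. The function name recode_max_pair_score, the reduced codon table SYNONYMS, and the values in TOY_SCORES are all hypothetical and chosen only for the example; real codon pair scores would come from genome statistics.

```python
from typing import Dict, List, Tuple

# Hypothetical, reduced codon table: only the amino acids used in the example.
SYNONYMS: Dict[str, List[str]] = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
}

# Hypothetical codon pair scores (made-up numbers, not measured values).
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("ATG", "AAA"): 0.5, ("ATG", "AAG"): -0.2,
    ("AAA", "TTT"): -1.0, ("AAA", "TTC"): 0.1,
    ("AAG", "TTT"): 0.9, ("AAG", "TTC"): 0.3,
}


def recode_max_pair_score(
    protein: str,
    pair_score: Dict[Tuple[str, str], float],
    synonyms: Dict[str, List[str]] = SYNONYMS,
    default: float = 0.0,
) -> Tuple[float, List[str]]:
    """Pick one codon per amino acid to maximize the summed adjacent-pair score.

    Assumes an unconstrained model: any synonymous codon may be used at any
    position, and pairs missing from pair_score contribute `default`.
    """
    if not protein:
        return 0.0, []
    # best[c] = (best score of a recoded prefix ending in codon c, its codons)
    best = {c: (0.0, [c]) for c in synonyms[protein[0]]}
    for aa in protein[1:]:
        nxt = {}
        for c in synonyms[aa]:
            # Extend the best-scoring prefix over all possible previous codons.
            score, path = max(
                ((s + pair_score.get((p, c), default), pth)
                 for p, (s, pth) in best.items()),
                key=lambda t: t[0],
            )
            nxt[c] = (score, path + [c])
        best = nxt
    return max(best.values(), key=lambda t: t[0])


total, codons = recode_max_pair_score("MKF", TOY_SCORES)
print("".join(codons))  # ATGAAGTTT
```

With these toy scores, a greedy left-to-right choice would pick AAA after ATG (score 0.5) and end with a lower total than the dynamic program, which accepts the worse first pair (ATG, AAG) to reach the high-scoring pair (AAG, TTT). The running time is linear in the protein length and quadratic in the maximum number of synonymous codons per amino acid; additional design constraints would require a richer state or a different method.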
