Evolutionary design of multiple genes encoding the same protein

Motivation: Enhancing the expression level of a target protein is an important goal in synthetic biology. A widely used strategy is to integrate multiple copies of a gene encoding the target protein into the genome of a host organism. Integrating highly similar sequences, however, can induce homologous recombination between them, which ultimately reduces the number of integrated gene copies.

Results: We propose a method for designing multiple protein-coding sequences (CDSs) that encode the same protein but are unlikely to induce homologous recombination. The method, which is based on a multi-objective genetic algorithm, designs a set of CDSs whose nucleotide sequences are as different from one another as possible and whose codon usage frequencies are as well adapted as possible to the host organism. We show that our method not only designs such a set of CDSs successfully, but also provides insight into the trade-off between nucleotide differences among gene copies and codon usage frequencies.

Availability and Implementation: Our method, named Tandem Designer, is available as a web-based application at http://tandem.trahed.jp/tandem/.

Contact: terai_goro@intec.co.jp or asai@k.u-tokyo.ac.jp

Supplementary information: Supplementary data are available at Bioinformatics online.
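The two design objectives named in the abstract, and the trade-off between them, can be illustrated with a small sketch. The abstract does not state the exact objective functions used by Tandem Designer, so the code below substitutes simple stand-ins: the minimum pairwise Hamming distance between CDSs for "nucleotide difference", and a CAI-style geometric mean of codon weights for "codon adaptation". The weight table `TOY_WEIGHTS`, the function names, and the example sequences are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import exp, log

# Hypothetical relative-adaptiveness weights (0 < w <= 1) for a toy host.
# Real weights would be derived from the host organism's codon usage table.
TOY_WEIGHTS = {
    "GCT": 1.0, "GCC": 0.6, "GCA": 0.3, "GCG": 0.2,  # Ala
    "AAA": 1.0, "AAG": 0.5,                          # Lys
    "GAA": 1.0, "GAG": 0.4,                          # Glu
}


def codons(cds):
    """Split a CDS into codons; its length must be a multiple of 3."""
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_difference(cds_set):
    """Stand-in for objective 1: the smallest Hamming distance over all pairs."""
    return min(hamming(a, b) for a, b in combinations(cds_set, 2))


def adaptation_score(cds, weights=TOY_WEIGHTS):
    """Stand-in for objective 2: geometric mean of the codon weights."""
    ws = [weights[c] for c in codons(cds)]
    return exp(sum(log(w) for w in ws) / len(ws))


def objectives(cds_set):
    """Both objectives for a candidate set; larger is better for each."""
    worst_adaptation = min(adaptation_score(c) for c in cds_set)
    return (min_pairwise_difference(cds_set), worst_adaptation)


def dominates(p, q):
    """True if p is at least as good as q everywhere and strictly better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


# Two candidate sets, each with two CDSs encoding Ala-Lys-Glu.
set_a = ["GCTAAAGAA", "GCCAAGGAG"]  # more divergent, less adapted second copy
set_b = ["GCTAAAGAA", "GCTAAAGAG"]  # less divergent, better adapted second copy

score_a = objectives(set_a)
score_b = objectives(set_b)
print(score_a, score_b)
print(dominates(score_a, score_b), dominates(score_b, score_a))
```

In this toy case neither candidate set dominates the other: one is more divergent, the other better adapted. A multi-objective genetic algorithm would retain both as non-dominated solutions rather than collapsing the two objectives into a single score.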
