Optimizing restriction site placement for synthetic genomes

Restriction enzymes are the workhorses of molecular biology. We introduce a new problem that arises in our project to design virus variants as potential vaccines: modifying virus-length genomes to introduce large numbers of unique restriction enzyme recognition sites while preserving wild-type function through synonymous codon substitution. We show that the resulting problem is NP-complete, give an exponential-time algorithm along with well-performing heuristics, and report strong results for five sample viral genomes. The modified genomes have several times more unique restriction sites, and the maximum gap between adjacent sites is reduced three- to nine-fold.
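
To make the core operation concrete, the following is a minimal sketch of placing one recognition site at a given position using only synonymous codon changes. It is an illustration, not the algorithm or heuristics of the paper: the function names (`translate`, `place_site`) are hypothetical, the standard genetic code is assumed, and the sequence is assumed to be a single open reading frame starting at index 0 with a length that is a multiple of three. It does not address site uniqueness or the choice of which sites to place, which is where the hardness of the full problem lies.

```python
# Standard genetic code, codons enumerated in TCAG order (assumption: NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Map each amino acid (or stop) to all codons that encode it.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(seq):
    """Translate a DNA string in frame 0 (length must be a multiple of 3)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def place_site(seq, site, pos):
    """Return seq rewritten so that `site` occurs at index `pos`, changing
    only synonymous codons; return None if that is impossible."""
    end = pos + len(site)
    if pos < 0 or end > len(seq):
        return None
    out = list(seq)
    # Visit every codon that overlaps the window [pos, end).
    for c in range(pos // 3, (end - 1) // 3 + 1):
        start = 3 * c
        codon = seq[start:start + 3]
        # Try synonymous codons, preferring those closest to the original.
        candidates = sorted(
            SYNONYMS[CODON_TABLE[codon]],
            key=lambda cand: sum(x != y for x, y in zip(cand, codon)),
        )
        for cand in candidates:
            # Bases inside the window must match the site; others are free.
            if all(
                not (pos <= start + k < end) or cand[k] == site[start + k - pos]
                for k in range(3)
            ):
                out[start:start + 3] = cand
                break
        else:
            return None  # no synonymous codon is compatible with the site
    return "".join(out)


# In-frame example: EcoRI site GAATTC over codons GAG (Glu) and TTT (Phe).
print(place_site("ATGGAGTTTGCA", "GAATTC", 3))
# Out-of-frame example: GTAC starting at index 1 of AGA (Arg) ACC (Thr).
print(place_site("AGAACC", "GTAC", 1))
# Impossible: Met has a single codon, so GAATTC cannot start at index 0.
print(place_site("ATGTGG", "GAATTC", 0))
```

Note that bases outside the site window but inside an overlapped codon are free to change, which is why the second call can succeed by switching the arginine codon from AGA to CGT.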
