Optimizing restriction site placement for synthetic genomes

Restriction enzymes are the workhorses of molecular biology. We introduce a new problem that arises in our project to design virus variants as potential vaccines: modifying virus-length genomes to introduce large numbers of unique restriction enzyme recognition sites while preserving wild-type function through substitution of synonymous codons. We show that the resulting problem is NP-complete, give an exponential-time algorithm, and propose effective heuristics that give excellent results on five sample viral genomes. The modified genomes have several times more unique restriction sites, and the maximum gap between adjacent sites is reduced three- to nine-fold.
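To make the constraint concrete, the following is a minimal Python sketch of the basic feasibility check behind the problem: whether a recognition site can be written at a given position using only synonymous codon changes. It is not the algorithm or the heuristics of the paper. The function names (`site_positions`, `can_place_site`, `max_gap`) are hypothetical, the sequence is assumed to be a coding region in frame from index 0, and the sketch ignores uniqueness of sites and interactions between neighbouring placements.

```python
# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def site_positions(seq, site):
    """Return every start index at which `site` occurs in `seq`."""
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


def can_place_site(seq, site, pos):
    """True if overwriting seq[pos:pos+len(site)] with `site` keeps every
    affected codon synonymous. Assumes (hypothetically) that `seq` is a
    coding sequence in frame starting at index 0."""
    end = pos + len(site)
    if pos < 0 or end > len(seq):
        return False
    candidate = seq[:pos] + site + seq[end:]
    first = pos - pos % 3                                  # start of first touched codon
    last = min(len(seq) - len(seq) % 3, end + (-end) % 3)  # end of last whole codon
    for i in range(first, last, 3):
        if CODON_TABLE[seq[i:i + 3]] != CODON_TABLE[candidate[i:i + 3]]:
            return False
    return True


def max_gap(seq_len, cut_positions):
    """Largest distance between adjacent sites, counting both sequence ends."""
    points = [0] + sorted(cut_positions) + [seq_len]
    return max(b - a for a, b in zip(points, points[1:]))


wild_type = "ATGGGTTCTAAA"  # toy sequence encoding Met-Gly-Ser-Lys
bam_hi = "GGATCC"           # BamHI recognition site
print(can_place_site(wild_type, bam_hi, 3))  # GGT->GGA and TCT->TCC are synonymous
print(can_place_site(wild_type, bam_hi, 0))  # would change the ATG codon
```

In this toy case the site fits at offset 3 because both affected codons have synonymous alternatives that spell out the site. The optimization problem described above is harder because many candidate sites compete for overlapping positions and each chosen site must remain unique in the genome.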
