Predicting the Genetic Stability of Engineered DNA Sequences with the EFM Calculator

Unwanted evolution can rapidly degrade the performance of genetically engineered circuits and metabolic pathways installed in living organisms. We created the Evolutionary Failure Mode (EFM) Calculator to computationally detect common sources of genetic instability in an input DNA sequence. It predicts two types of mutational hotspots: deletions mediated by homologous recombination and indels caused by replication slippage on simple sequence repeats. We tested the performance of our algorithm on genetic circuits that were previously redesigned for greater evolutionary reliability and analyzed the stability of sequences in the iGEM Registry of Standard Biological Parts. More than half of the parts in the Registry are predicted to experience >100-fold elevated mutation rates due to the inclusion of unstable sequence configurations. We anticipate that the EFM Calculator will be a useful negative design tool for avoiding volatile DNA encodings, thereby increasing the evolutionary lifetimes of synthetic biology devices.
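The two hotspot classes named above can be illustrated with a minimal Python sketch. This is not the EFM Calculator's implementation. The function names, the unit-length limit, and the copy-number and repeat-length thresholds are all assumptions chosen for illustration. The sketch only locates candidate sites: tandem runs of short units, where replication slippage may occur, and exact repeated substrings, which may mediate recombination. It does not estimate mutation rates.

```python
import re
from collections import defaultdict


def find_simple_sequence_repeats(seq, max_unit=4, min_copies=3, min_mono=5):
    """Return sorted (start, unit, copies) tuples for tandem repeats of short units.

    Thresholds are illustrative assumptions, not values from the paper.
    The scan is greedy and non-overlapping, so it is not exhaustive.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # Homopolymer runs use their own (assumed) minimum length.
        threshold = min_mono if unit_len == 1 else min_copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, threshold - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter unit
            # (for example "AA" or "ATAT"); the shorter unit reports them.
            if unit in (unit + unit)[1:-1]:
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


def find_direct_repeats(seq, min_len=16):
    """Map each exact substring of length min_len that occurs more than once
    to the list of its start positions (same strand only).

    A longer repeat shows up as several overlapping entries.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for start in range(len(seq) - min_len + 1):
        positions[seq[start:start + min_len]].append(start)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


if __name__ == "__main__":
    ssr_example = "GGC" + "A" * 6 + "CGC" + "AT" * 4 + "GCC"
    print(find_simple_sequence_repeats(ssr_example))

    repeat = "ACGTTGCAGGCTAAGCTTGA"  # hypothetical 20-nt repeated element
    construct = repeat + "TTTTCCCCGGGG" + repeat
    print(find_direct_repeats(construct, min_len=20))
```

Exact-match scanning keeps the sketch short and dependency-free. A practical tool would likely also need to handle imperfect repeats and both strands, and would weight each site by repeat length and spacing.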
