Multiplexed gene synthesis in emulsions for exploring protein functional landscapes

Large-scale gene synthesis in tiny droplets

Gene synthesis technology is important for functional characterization of DNA sequences and for the development of synthetic biology. However, current methods are limited by their low scalability and high cost. Plesa et al. developed a gene synthesis method, DropSynth, which uses barcoded beads to concentrate oligos and subsequently assemble them into synthetic genes within picoliter emulsion droplets. DropSynth allows generation of large libraries of thousands of genes and functional testing of all possible mutations of a particular sequence.

Science, this issue p. 343

A gene synthesis method, DropSynth, allows for the synthesis and characterization of thousands of pooled genes.

Improving our ability to construct and functionally characterize DNA sequences would broadly accelerate progress in biology. Here, we introduce DropSynth, a scalable, low-cost method to build thousands of defined gene-length constructs in a pooled (multiplexed) manner. DropSynth uses a library of barcoded beads that pull down the oligonucleotides necessary for a gene’s assembly, which are then processed and assembled in water-in-oil emulsions. We used DropSynth to successfully build more than 7000 synthetic genes that encode phylogenetically diverse homologs of two essential genes in Escherichia coli. We tested the ability of phosphopantetheine adenylyltransferase homologs to complement a knockout E. coli strain in multiplex, revealing core functional motifs and reasons underlying homolog incompatibility. DropSynth coupled with multiplexed functional assays allows us to rationally explore sequence-function relationships at an unprecedented scale.
