Protein Structure from Experimental Evolution

Natural evolution encodes rich information about the structure and function of biomolecules in the genetic record. Previously, statistical analysis of co-variation patterns in natural protein families has enabled the accurate computation of 3D structures. Here, we explored generating similar information by experimental evolution, starting from a single gene and performing multiple cycles of in vitro mutagenesis and functional selection in Escherichia coli. We evolved two antibiotic resistance proteins, β-lactamase PSE1 and acetyltransferase AAC6, and obtained hundreds of thousands of diverse functional sequences. Using evolutionary coupling analysis, we inferred residue interaction constraints that were in agreement with contacts in known 3D structures, confirming genetic encoding of structural constraints in the selected sequences. Computational protein folding with interaction constraints then yielded 3D structures with the same fold as natural relatives. This work lays the foundation for a new experimental method (3Dseq) for protein structure determination, combining evolution experiments with inference of residue interactions from sequence information. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.
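The idea of reading residue contacts out of co-variation in a set of functional sequences can be illustrated with a small sketch. The abstract does not specify the evolutionary coupling algorithm, so this example swaps in a much simpler statistic: mutual information between alignment columns with average product correction (APC). All function names, the toy alignment, and the reference contact set are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(seqs, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(seqs)
    count_i = Counter(s[i] for s in seqs)
    count_j = Counter(s[j] for s in seqs)
    count_ij = Counter((s[i], s[j]) for s in seqs)
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        mi += p_ab * math.log(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


def apc_scores(seqs):
    """Score every column pair by MI minus the average product correction."""
    length = len(seqs[0])
    mi = {(i, j): mutual_information(seqs, i, j)
          for i, j in combinations(range(length), 2)}
    # Mean MI of each column with all other columns.
    col_mean = [0.0] * length
    for (i, j), value in mi.items():
        col_mean[i] += value
        col_mean[j] += value
    col_mean = [total / (length - 1) for total in col_mean]
    overall = sum(mi.values()) / len(mi)
    if overall == 0:
        return mi  # no co-variation at all; nothing to correct
    return {(i, j): value - col_mean[i] * col_mean[j] / overall
            for (i, j), value in mi.items()}


def top_pairs(scores, k, min_sep=1):
    """Return the k highest-scoring pairs at least min_sep apart in sequence."""
    eligible = [pair for pair in scores if pair[1] - pair[0] >= min_sep]
    return sorted(eligible, key=lambda pair: scores[pair], reverse=True)[:k]


def precision(predicted, contacts):
    """Fraction of predicted pairs that appear in a reference contact set."""
    return sum(1 for pair in predicted if pair in contacts) / len(predicted)


# Hypothetical toy alignment: columns 0 and 3 co-vary (A with E, D with K),
# column 2 varies independently, columns 1 and 4 are conserved.
toy_alignment = ["AKLEG", "AKLEG", "DKLKG", "DKLKG", "AKVEG", "DKVKG"]
# Hypothetical reference contact, standing in for a known 3D structure.
toy_contacts = {(0, 3)}

predicted = top_pairs(apc_scores(toy_alignment), k=1)
print(predicted, precision(predicted, toy_contacts))
```

On this toy input the only co-varying pair, columns 0 and 3, ranks first and matches the assumed reference contact. Real analyses work on far larger alignments and typically use global statistical models rather than pairwise mutual information, which cannot separate direct from indirect correlations.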
