A structure-based deep learning framework for protein engineering

While deep learning methods exist to guide protein optimization, examples of novel proteins generated with these techniques require a priori mutational data. Here we report a 3D convolutional neural network that associates amino acids with their neighboring chemical microenvironments at state-of-the-art accuracy. This algorithm enables identification of novel gain-of-function mutations, and subsequent experiments confirm substantive improvements in stability-associated phenotypes in vivo across three diverse proteins.
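
A common way to present a residue's chemical microenvironment to a 3D convolutional network is to voxelize the surrounding atoms into a grid with one channel per element type. The sketch below illustrates that general idea only. The function name `voxelize`, the channel set, the 20 Å box, and the 1 Å resolution are all assumptions made for illustration and are not taken from the abstract.

```python
from collections import Counter

# Hypothetical channel layout: one occupancy channel per element type.
CHANNELS = ("C", "N", "O", "S")


def voxelize(atoms, center, box=20.0, resolution=1.0):
    """Count atoms per voxel in a cube of side `box` centred on `center`.

    atoms:  iterable of (element, x, y, z) tuples, coordinates in angstroms.
    center: (x, y, z) of the residue whose microenvironment is described.
    Returns a sparse Counter keyed by (channel_index, i, j, k).
    Box size and resolution are illustrative assumptions.
    """
    n = int(box / resolution)   # voxels per axis
    half = box / 2.0
    grid = Counter()
    for element, x, y, z in atoms:
        if element not in CHANNELS:
            continue            # ignore elements without a channel
        idx = []
        for coord, c in zip((x, y, z), center):
            i = int((coord - c + half) // resolution)
            if not 0 <= i < n:
                break           # atom lies outside the box on this axis
            idx.append(i)
        else:
            grid[(CHANNELS.index(element), *idx)] += 1
    return grid


# Usage with made-up coordinates: two atoms fall inside the box.
example_atoms = [
    ("C", 0.0, 0.0, 0.0),
    ("N", 1.2, 0.0, 0.0),
    ("O", 50.0, 0.0, 0.0),   # too far from the centre, dropped
    ("H", 0.0, 0.0, 0.0),    # no channel for hydrogen in this sketch
]
example_grid = voxelize(example_atoms, center=(0.0, 0.0, 0.0))
```

A network trained on such grids would output a probability for each of the 20 amino acids at the central position. How those outputs are turned into mutation candidates is not specified in the abstract and is left out of this sketch.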
