An Evolution-Based Approach to De Novo Protein Design.

EvoDesign is a computational algorithm for rapidly creating new protein sequences that are compatible with a specified protein structure. It can be used to optimize protein stability, to resculpt the protein surface to eliminate undesired protein-protein interactions, and to optimize protein-protein binding. What chiefly distinguishes EvoDesign from other protein design programs is that it uses evolutionary information to guide the sequence search toward native-like sequences known to adopt folds structurally similar to the target. This information takes the form of structural profiles: the observed frequencies of amino acids at each position in the structure, collected from proteins with similar folds and from complexes with similar interfaces. Such profiles implicitly capture many subtle effects that are essential for correct folding and protein-binding interactions. Because of this evolutionary information, the sequences designed by EvoDesign have native-like folding and binding properties not seen with purely physics-based design methods. In this chapter, we describe how EvoDesign can be used to redesign proteins, with a focus on the computational and experimental procedures that can be used to validate the designs.
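
The idea of a structural profile can be illustrated with a minimal Python sketch. It is not EvoDesign's implementation: the function names (`build_profile`, `profile_score`), the pseudocount, the uniform background frequency, and the toy alignment are all assumptions made for illustration. The sketch tabulates per-position amino acid frequencies from sequences assumed to be already structurally aligned to the target, then scores a candidate sequence by its summed log-odds against that profile.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Return per-position amino acid frequencies for equal-length aligned sequences.

    Gap characters (anything outside the 20 standard residues) are ignored.
    The pseudocount is an illustrative assumption that avoids zero frequencies.
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_seqs if seq[pos] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return profile


def profile_score(seq, profile, background=1.0 / 20):
    """Sum of log-odds of each residue against a uniform background (assumed)."""
    if len(seq) != len(profile):
        raise ValueError("sequence and profile lengths differ")
    return sum(math.log(profile[i][aa] / background) for i, aa in enumerate(seq))


# Hypothetical toy alignment standing in for structurally similar proteins.
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLA"]
toy_profile = build_profile(toy_alignment)

# A sequence close to the alignment consensus should outscore an unrelated one.
print(profile_score("MKVLA", toy_profile))
print(profile_score("WWWWW", toy_profile))
```

In a design setting, a score of this kind would be one term that rewards candidate sequences resembling those observed in similar folds; how EvoDesign weights and combines its terms is not specified here.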
