Computer-Aided Protein Directed Evolution: a Review of Web Servers, Databases and other Computational Tools for Protein Engineering

The combination of computational and directed evolution methods has proven a winning strategy for protein engineering. We refer to this approach as computer-aided protein directed evolution (CAPDE), and this review summarizes recent developments in this rapidly growing field. We restrict ourselves to an overview of the availability, usability and limitations of web servers, databases and other computational tools proposed in the last five years. The goal of this review is to provide concise information about currently available computational resources that can assist the design of directed evolution-based protein engineering experiments.
