Assessing computational methods for predicting protein stability upon mutation: good on average but not in the details.

Methods for protein modeling and design have advanced rapidly in recent years. At the heart of these computational methods is an energy function that calculates the free energy of the system. Many of these functions were also developed to estimate the consequences of mutation on protein stability or binding affinity. In the current study, we chose six methods previously reported to predict the change in protein stability (ΔΔG) upon mutation: CC/PBSA, EGAD, FoldX, I-Mutant2.0, Rosetta and Hunter. We evaluated their performance on a large set of 2156 single mutations, excluding for each program the mutations used to train it. The correlation coefficients between experimental and predicted ΔΔG values ranged from 0.59 for the best-performing method to 0.26 for the worst. All the tested methods predicted the correct trend but failed to provide precise values. This is not due to a lack of precision in the experimental data, which showed a correlation coefficient of 0.86 between different measurements. Combining the methods did not significantly improve prediction accuracy compared to a single method. These results suggest that there is still room for improvement, which is crucial if force fields are to perform better in their various tasks.
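The evaluation described above rests on comparing experimental and predicted ΔΔG values through a correlation coefficient. The following is a minimal Python sketch of that kind of comparison, not the authors' actual pipeline. It assumes a Pearson correlation, and adds two illustrative companion measures (sign agreement and mean absolute error) to show how a method can capture the trend while missing the precise values. The function names and the toy numbers are hypothetical and are not data from the study; the sign convention for ΔΔG is also left as an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def sign_agreement(xs, ys):
    """Fraction of pairs whose values fall on the same side of zero
    (a crude check that the predicted trend is right)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(1 for x, y in zip(xs, ys) if (x > 0) == (y > 0))
    return matches / len(xs)


def mean_abs_error(xs, ys):
    """Mean absolute difference between paired values (precision of values)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    # Hypothetical toy values in kcal/mol, for illustration only.
    experimental = [0.5, 1.2, -0.3, 2.8, 0.9, 3.5]
    predicted = [0.2, 0.6, -0.1, 0.9, 1.4, 1.6]

    print("Pearson r:      ", round(pearson_r(experimental, predicted), 2))
    print("Sign agreement: ", round(sign_agreement(experimental, predicted), 2))
    print("Mean abs. error:", round(mean_abs_error(experimental, predicted), 2))
```

Reporting an error measure alongside the correlation is a design choice for this sketch: a correlation coefficient alone does not reveal systematic under- or over-estimation of the magnitude of ΔΔG.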
