QMEAN: A comprehensive scoring function for model quality assessment

In protein structure prediction, a considerable number of alternative models are usually produced from which subsequently the final model has to be selected. Thus, a scoring function for the identification of the best model within an ensemble of alternative models is a key component of most protein structure prediction pipelines. QMEAN, which stands for Qualitative Model Energy ANalysis, is a composite scoring function describing the major geometrical aspects of protein structures. Five different structural descriptors are used. The local geometry is analyzed by a new kind of torsion angle potential over three consecutive amino acids. A secondary structure‐specific distance‐dependent pairwise residue‐level potential is used to assess long‐range interactions. A solvation potential describes the burial status of the residues. Two simple terms describing the agreement of predicted and calculated secondary structure and solvent accessibility, respectively, are also included. A variety of different implementations are investigated and several approaches to combine and optimize them are discussed. QMEAN was tested on several standard decoy sets including a molecular dynamics simulation decoy set as well as on a comprehensive data set of totally 22,420 models from server predictions for the 95 targets of CASP7. In a comparison to five well‐established model quality assessment programs, QMEAN shows a statistically significant improvement over nearly all quality measures describing the ability of the scoring function to identify the native structure and to discriminate good from bad models. The three‐residue torsion angle potential turned out to be very effective in recognizing the native fold. Proteins 2008. © 2007 Wiley‐Liss, Inc.

[1]  A. Finkelstein,et al.  Residue-residue mean-force potentials for protein structure recognition. , 1997, Protein engineering.

[2]  C. Sander,et al.  Evaluation of protein models by atomic solvation preference. , 1992, Journal of molecular biology.

[3]  M. Levitt,et al.  Energy functions that discriminate X-ray and near native folds from well-constructed decoys. , 1996, Journal of molecular biology.

[4]  D T Jones,et al.  Protein secondary structure prediction based on position-specific scoring matrices. , 1999, Journal of molecular biology.

[5]  S. Tosatto,et al.  Application of MM/PBSA colony free energy to loop decoy discrimination: Toward correlation between energy and root mean square deviation , 2005, Protein science : a publication of the Protein Society.

[6]  K. Dill,et al.  Statistical potentials extracted from protein structures: how accurate are they? , 1996, Journal of molecular biology.

[7]  A. Elofsson,et al.  Can correct protein models be identified? , 2003, Protein science : a publication of the Protein Society.

[8]  M. Karplus,et al.  Effective energy functions for protein structure prediction. , 2000, Current opinion in structural biology.

[9]  R Samudrala,et al.  Ab initio construction of protein tertiary structures using a hierarchical approach. , 2000, Journal of molecular biology.

[10]  R L Jernigan,et al.  Identifying sequence-structure pairs undetected by sequence alignments. , 2000, Protein engineering.

[11]  D. T. Jones,et al.  A new approach to protein fold recognition , 1992, Nature.

[12]  A. Finkelstein,et al.  Why do protein architectures have boltzmann‐like statistics? , 1995, Proteins.

[13]  J. Skolnick,et al.  Automated structure prediction of weakly homologous proteins on a genomic scale. , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[14]  Hongyi Zhou,et al.  Distance‐scaled, finite ideal‐gas reference state improves structure‐derived potentials of mean force for structure selection and stability prediction , 2002, Protein science : a publication of the Protein Society.

[15]  Paul W. Fitzjohn,et al.  In silico protein recombination: enhancing template and sequence alignment selection for comparative protein modelling. , 2003, Journal of molecular biology.

[16]  Silvio C. E. Tosatto,et al.  TAP score: torsion angle propensity normalization applied to local protein structure evaluation , 2007, BMC Bioinformatics.

[17]  Silvio C. E. Tosatto,et al.  The Victor/FRST Function for Model Quality Estimation , 2005, J. Comput. Biol..

[18]  David C. Jones,et al.  GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. , 1999, Journal of molecular biology.

[19]  Federico Fogolari,et al.  Protocol for MM/PBSA molecular dynamics simulations of proteins. , 2003, Biophysical journal.

[20]  András Fiser,et al.  Molecular Biophysics , 2022 .

[21]  M. Karplus,et al.  Discrimination of the native from misfolded protein models with an energy function including implicit solvation. , 1999, Journal of molecular biology.

[22]  M. Karplus,et al.  CHARMM: A program for macromolecular energy, minimization, and dynamics calculations , 1983 .

[23]  D Gilis,et al.  Predicting protein stability changes upon mutation using database-derived potentials: solvent accessibility determines the importance of local versus non-local interactions along the sequence. , 1997, Journal of molecular biology.

[24]  A. Sali,et al.  Statistical potentials for fold assessment , 2009 .

[25]  Narayanan Eswar,et al.  High-throughput computational and experimental techniques in structural genomics. , 2004, Genome research.

[26]  R. Elber,et al.  Distance‐dependent, pair potential for protein folding: Results from linear optimization , 2000, Proteins.

[27]  Richard Bonneau,et al.  An improved protein decoy set for testing energy functions for protein structure prediction , 2003, Proteins.

[28]  E S Huang,et al.  Factors affecting the ability of energy functions to discriminate correct from incorrect folds. , 1997, Journal of molecular biology.

[29]  R. Samudrala,et al.  An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. , 1998, Journal of molecular biology.

[30]  T. N. Bhat,et al.  The Protein Data Bank , 2000, Nucleic Acids Res..

[31]  M. Levitt,et al.  A novel approach to decoy set generation: designing a physical energy function having local minima with native structure characteristics. , 2003, Journal of molecular biology.

[32]  A. Kolinski,et al.  Derivation of protein‐specific pair potentials based on weak sequence fragment similarity , 2000, Proteins.

[33]  B. Rost,et al.  Critical assessment of methods of protein structure prediction (CASP)—Round 6 , 2005 .

[34]  M J Rooman,et al.  Are database-derived potentials valid for scoring both forward and inverted protein folding? , 1995, Protein engineering.

[35]  Giorgio Valle,et al.  Simple consensus procedures are effective and sufficient in secondary structure prediction. , 2003, Protein engineering.

[36]  Guoli Wang,et al.  PISCES: a protein sequence culling server , 2003, Bioinform..

[37]  Alessandro Albiero,et al.  Fine-grained statistical torsion angle potentials are effective in discriminating native protein structures. , 2006, Current drug discovery technologies.

[38]  J Moult,et al.  Comparison of database potentials and molecular mechanics force fields. , 1997, Current opinion in structural biology.

[39]  Liam J. McGuffin,et al.  Improving sequence-based fold recognition by using 3D model quality assessment , 2005, Bioinform..

[40]  D Gilis,et al.  Stability changes upon mutation of solvent-accessible residues in proteins evaluated by database-derived potentials. , 1996, Journal of molecular biology.

[41]  M. Sippl Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge-based prediction of local structures in globular proteins. , 1990, Journal of molecular biology.

[42]  M. Levitt,et al.  Improved protein structure selection using decoy-dependent discriminatory functions , 2004, BMC Structural Biology.

[43]  A. Sali,et al.  Comparative protein structure modeling by iterative alignment, model building and model assessment. , 2003, Nucleic Acids Research.

[44]  J. Skolnick,et al.  Local propensities and statistical potentials of backbone dihedral angles in proteins. , 2004, Journal of molecular biology.

[45]  Silvio C. E. Tosatto,et al.  A decoy set for the thermostable subdomain from chicken villin headpiece, comparison of different free energy estimators , 2005, BMC Bioinformatics.

[46]  M. Sippl,et al.  Detection of native‐like models for amino acid sequences of unknown three‐dimensional structure in a data base of known protein conformations , 1992, Proteins.

[47]  S. Wodak,et al.  Factors influencing the ability of knowledge-based potentials to identify native sequence-structure matches. , 1994, Journal of molecular biology.

[48]  A. Sali,et al.  Statistical potential for assessment and prediction of protein structures , 2006, Protein science : a publication of the Protein Society.

[49]  J. Skolnick,et al.  A distance‐dependent atomic knowledge‐based potential for improved protein structure selection , 2001, Proteins.

[50]  Marc A. Martí-Renom,et al.  EVA: continuous automatic evaluation of protein structure prediction servers , 2001, Bioinform..

[51]  Burkhard Rost,et al.  Prediction in 1D: secondary structure, membrane helices, and accessibility. , 2003, Methods of biochemical analysis.

[52]  F. Melo,et al.  Novel knowledge-based mean force potential at atomic level. , 1997, Journal of molecular biology.

[53]  A. Sali,et al.  A composite score for predicting errors in protein structure models , 2006, Protein science : a publication of the Protein Society.

[54]  Stefano Toppo,et al.  Improving the quality of protein structure models by selecting from alignment alternatives , 2006, BMC Bioinform..

[55]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[56]  B. Rost Review: protein secondary structure prediction continues to rise. , 2001, Journal of structural biology.

[57]  David Baker,et al.  Protein Structure Prediction Using Rosetta , 2004, Numerical Computer Methods, Part D.

[58]  Adam Zemla,et al.  LGA: a method for finding 3D similarities in protein structures , 2003, Nucleic Acids Res..

[59]  Hongyi Zhou,et al.  An accurate, residue‐level, pair potential of mean force for folding and binding based on the distance‐scaled, ideal‐gas reference state , 2004, Protein science : a publication of the Protein Society.

[60]  Yang Zhang,et al.  Scoring function for automated assessment of protein structure template quality , 2004, Proteins.

[61]  Daniel Fischer,et al.  Servers for protein structure prediction. , 2006, Current opinion in structural biology.

[62]  Arne Elofsson,et al.  3D-Jury: A Simple Approach to Improve Protein Structure Predictions , 2003, Bioinform..

[63]  W. Kabsch,et al.  Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features , 1983, Biopolymers.

[64]  R Samudrala,et al.  Decoys ‘R’ Us: A database of incorrect conformations to improve protein structure prediction , 2000, Protein science : a publication of the Protein Society.

[65]  Pierre Baldi,et al.  SCRATCH: a protein structure and structural feature prediction server , 2005, Nucleic Acids Res..

[66]  Dietmar Schomburg,et al.  Structural analysis and prediction of protein mutant stability using distance and torsion potentials: Role of secondary structure and solvent accessibility , 2006, Proteins.

[67]  D. Shortle Propensities, probabilities, and the Boltzmann hypothesis , 2003, Protein science : a publication of the Protein Society.

[68]  G. N. Ramachandran,et al.  Stereochemistry of polypeptide chain configurations. , 1963, Journal of molecular biology.

[69]  R. Jernigan,et al.  Inter-residue potentials in globular proteins and the dominance of highly specific hydrophilic interactions at close separation. , 1997, Journal of molecular biology.

[70]  D. Shortle Composites of local structure propensities: evidence for local encoding of long-range structure. , 2002, Protein science : a publication of the Protein Society.

[71]  Dietmar Schomburg,et al.  Prediction of protein thermostability with a direction‐ and distance‐dependent knowledge‐based potential , 2005, Protein science : a publication of the Protein Society.

[72]  Manfred J. Sippl,et al.  Boltzmann's principle, knowledge-based mean fields and protein folding. An approach to the computational determination of protein structures , 1993, J. Comput. Aided Mol. Des..