MUFOLD‐WQA: A new selective consensus method for quality assessment in protein structure prediction

Assessing the quality of predicted models is essential in protein tertiary structure prediction. In past Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments, consensus quality assessment (QA) methods have been shown to be very effective, outperforming single‐model methods and other competing approaches by a large margin. In the consensus QA approach, the quality score of a model is typically estimated from its pairwise structural similarity to a set of reference models. In CASP8, the top QA servers differed mostly in how they selected the reference models. In this article, we present a new consensus method, “SelCon,” based on two key ideas: (1) to adaptively select appropriate reference models based on the attributes of the whole set of predicted models and (2) to weigh different reference models differently, and in particular not to use as references models that are too similar to or too different from the candidate model. We have developed several reference selection functions in SelCon and obtained improved QA results over existing QA methods in experiments using CASP7 and CASP8 data. In CASP9, completed in 2010, the new method was implemented in our MUFOLD‐WQA server. Both the official CASP9 assessment and our in‐house evaluation showed that MUFOLD‐WQA performed very well and achieved top performance in both the global structure QA and top‐model selection categories in CASP9. Proteins 2011; © 2011 Wiley‐Liss, Inc.
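The contrast between plain consensus scoring and selective reference choice can be illustrated with a minimal sketch. It is not the SelCon implementation: the abstract does not specify the reference selection functions, so the fixed cutoffs `low` and `high`, the function names, and the 0/1 weighting below are assumptions made purely for illustration. The input is assumed to be a precomputed symmetric matrix of pairwise structural similarities in [0, 1] (for example, GDT-TS-like scores).

```python
from typing import List, Sequence


def naive_consensus(sim: Sequence[Sequence[float]]) -> List[float]:
    """Score each model by its mean similarity to all other models."""
    n = len(sim)
    scores = []
    for i in range(n):
        others = [sim[i][j] for j in range(n) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    return scores


def selective_consensus(
    sim: Sequence[Sequence[float]],
    low: float = 0.2,    # hypothetical cutoff: references below this are "too different"
    high: float = 0.95,  # hypothetical cutoff: references above this are "too similar"
) -> List[float]:
    """Score each model using only references inside the [low, high] similarity band.

    This is a 0/1 weighting; a graded weighting scheme would replace the
    hard band with per-reference weights.
    """
    n = len(sim)
    scores = []
    for i in range(n):
        refs = [sim[i][j] for j in range(n) if j != i and low <= sim[i][j] <= high]
        if not refs:
            # Fall back to all other models when the band leaves no references.
            refs = [sim[i][j] for j in range(n) if j != i]
        scores.append(sum(refs) / len(refs) if refs else 0.0)
    return scores


if __name__ == "__main__":
    # Toy similarity matrix: models 0 and 1 are near-duplicates, model 3 is an outlier.
    sim = [
        [1.00, 0.98, 0.60, 0.10],
        [0.98, 1.00, 0.50, 0.10],
        [0.60, 0.50, 1.00, 0.30],
        [0.10, 0.10, 0.30, 1.00],
    ]
    print("naive:    ", naive_consensus(sim))
    print("selective:", selective_consensus(sim))
```

In the toy matrix, the near-duplicate pair (models 0 and 1) inflates each other's naive score, while the outlier (model 3) drags every naive score down; the band filter removes both effects. In an adaptive scheme of the kind the abstract describes, the cutoffs would depend on properties of the whole model pool rather than being fixed constants.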
