Evaluation of model quality predictions in CASP9

CASP has been assessing the state of the art in the a priori estimation of the accuracy of protein structure predictions since 2006. The inclusion of a model quality assessment category in CASP has contributed to the rapid development of methods in this area. In the latest experiment, 46 quality assessment groups tested their approaches to estimating the accuracy of protein models as a whole and/or on a per‐residue basis. We assessed the performance of these methods predominantly on the basis of the correlation between the predicted and observed quality of the models on both global and local scales. We also analyzed the ability of the methods to identify the models closest to the best one, to differentiate between good and bad models, and to identify well‐modeled regions. Our evaluations demonstrate that even though global quality assessment methods seem to approach perfection (weighted average per‐target Pearson's correlation coefficients are as high as 0.97 for the best groups), there is still room for improvement. First, all top‐performing methods use consensus approaches to generate quality estimates, and this strategy has its own limitations. Second, the methods that are based on the analysis of individual models lag far behind clustering techniques and need a boost in performance. The methods for estimating the per‐residue accuracy of models are less accurate than global quality assessment methods, with an average weighted per‐model correlation coefficient in the range of 0.63–0.72 for the best 10 groups. Proteins 2011; © 2011 Wiley‐Liss, Inc.
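
To illustrate the kind of statistic quoted above, the following is a minimal sketch of a weighted average of per-target Pearson correlation coefficients. The abstract does not specify the weighting scheme, so the choices here are assumptions: each per-target coefficient is transformed with Fisher's z, weighted by the number of models minus three, and the weighted mean is transformed back. The function names and the input layout are hypothetical and are not taken from the CASP evaluation software.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


def weighted_mean_correlation(per_target):
    """Average per-target correlations via Fisher's z transformation.

    per_target maps a target id to a pair (predicted, observed) of score
    lists, one entry per model. Weights of (n - 3) are an assumption of
    this sketch; targets with fewer than four models are skipped.
    """
    numerator = 0.0
    denominator = 0.0
    for predicted, observed in per_target.values():
        weight = len(predicted) - 3
        if weight <= 0:
            continue
        r = pearson(predicted, observed)
        # Clamp so that atanh stays finite for perfectly correlated data.
        r = max(min(r, 0.999999), -0.999999)
        numerator += weight * math.atanh(r)
        denominator += weight
    if denominator == 0:
        raise ValueError("no target has enough models")
    return math.tanh(numerator / denominator)
```

Averaging in z-space rather than averaging the raw coefficients reduces the bias that arises because the sampling distribution of r is skewed near ±1, which matters when the per-target values are as high as those reported for the best global methods.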
