Assessment of predictions in the model quality assessment category

This article presents our evaluation of the predictions submitted to the model quality assessment (QA) category in CASP7. In this newly introduced category, predictors were asked to provide quality estimates for protein structure models. The input for the QA category consists of the automatically produced models that are traditionally distributed to CASP participants. Predictors were asked to provide an index of the overall quality of each model (QM1) as well as an index of the expected correctness of each of its residues (QM2). For each participating group, we computed the correlation between the observed and predicted quality of the models and of the individual residues, and evaluated the statistical significance of the differences between groups. We also compared the results with those obtained by a “naïve predictor” that assigns a quality score related to how close the model is to the structure of the most similar protein of known structure. A method for assessing the overall quality of a model can have two aims: selecting the best (or one of the best) model(s) among a set of plausible choices, or assigning a nonrelative quality value to an individual model. The two strategies have different applications, albeit equally important ones. Our assessment of the QA category demonstrates that effective methods for the first task do exist, while there is room for improvement in the second. Although only a limited number of groups submitted predictions of residue-level accuracy, our data demonstrate that respectable accuracy in this task can be achieved by methods that rely on the comparison of different models for the same target. Proteins 2007. © 2007 Wiley-Liss, Inc.
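The two aims described above (ranking models against each other versus assigning an absolute quality value) can be illustrated with a minimal sketch. The abstract does not state which correlation coefficient or which observed quality measure was used, so the Pearson coefficient, the 0-100 quality scale, the function names, and the numbers below are all assumptions made purely for illustration, not the assessors' actual procedure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def selection_loss(predicted, observed):
    """Observed quality lost by picking the model with the top predicted score.

    Zero means the predictor selected the truly best model (first aim).
    """
    picked = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(observed) - observed[picked]


def mean_abs_error(predicted, observed):
    """Average absolute deviation; relevant to the second (absolute) aim.

    Assumes predicted and observed scores share the same scale.
    """
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical scores for four models of one target (made-up numbers).
predicted_quality = [60.0, 40.0, 50.0, 20.0]   # assumed QM1-like estimates
observed_quality = [80.0, 55.0, 60.0, 30.0]    # assumed measured quality

print(pearson(predicted_quality, observed_quality))
print(selection_loss(predicted_quality, observed_quality))
print(mean_abs_error(predicted_quality, observed_quality))
```

In this made-up case the predictor picks the best model (selection loss of zero) and correlates well with the observed values, yet its absolute estimates are systematically too low. That is the kind of gap between the two tasks that the abstract points to.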
