Assessment of template based protein structure predictions in CASP9

In the ninth edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP9), 61,665 models submitted by 176 groups were assessed for their accuracy in the template based modeling category. The models were evaluated numerically against their experimental control structures using two global measures (GDT and GDC) and a novel local score that evaluates the correct modeling of local interactions (lDDT). Overall, the state of the art of template based modeling in CASP9 is high, with many groups performing well. Among the methods registered as prediction “servers”, six independent groups performed better on average than the rest. The submissions by “human” groups were dominated by meta‐predictors, with one group performing noticeably better than the others. Most of the participating groups failed to assign realistic confidence estimates to their predictions, and only a very small fraction of the assessed methods provided highly accurate models and realistic error estimates at the same time. In addition, the accuracy of predictions for homo‐oligomeric assemblies was poor overall, and only one group performed better than a naïve control predictor. Here, we present the results of our assessment of the CASP9 predictions in the category of template based modeling, documenting the state of the art and highlighting areas for future development. Proteins 2011; © 2011 Wiley‐Liss, Inc.
