Assessing the accuracy of template-based structure prediction metaservers by comparison with structural genomics structures

The rapid growth in the number of known protein sequences has stimulated two complementary approaches to mapping these sequences structurally: theoretical structure prediction and experimental structure determination by structural genomics (SG). In this work, we assess the accuracy of two automated template-based structure prediction metaservers (genesilico.pl and bioinfo.pl) by measuring the structural similarity of their predicted models to the corresponding experimental structures, which were determined after the predictions were made. For about a quarter of the 199 targets chosen from SG programs, the metaservers predicted the structure "correctly," where "correct" means that more than 70% of the alpha carbon atoms in the model lie within 2 Å of their experimentally determined positions. Almost all of the targets modeled to this accuracy had an available template in the Protein Data Bank (PDB) with more than 25% sequence identity. Most SG targets with lower sequence identity to structures in the PDB were not predicted to this accuracy by the metaservers. We also compared the metaserver results with CASP8 results and found that the models obtained by participants in the CASP competition were significantly better than those produced by the metaservers.
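The correctness criterion above can be illustrated with a minimal Python sketch. It assumes the model and the experimental structure have already been superimposed and that their alpha carbon coordinates are given in the same residue order; the abstract does not say how the superposition was performed, and the function names `fraction_within` and `is_correct` are hypothetical. Treating "within 2 Å" as inclusive of exactly 2 Å is also an assumption.

```python
import math


def fraction_within(model_ca, native_ca, cutoff=2.0):
    """Fraction of C-alpha atoms whose model position lies within
    `cutoff` angstroms of the experimental position.

    Assumes both coordinate lists are already superimposed and
    aligned residue by residue (an assumption, not from the source).
    """
    if not model_ca or len(model_ca) != len(native_ca):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    close = sum(
        1 for m, n in zip(model_ca, native_ca) if math.dist(m, n) <= cutoff
    )
    return close / len(model_ca)


def is_correct(model_ca, native_ca, cutoff=2.0, min_fraction=0.70):
    """Apply the criterion from the abstract: strictly more than 70%
    of C-alpha atoms within 2 angstroms."""
    return fraction_within(model_ca, native_ca, cutoff) > min_fraction


# Toy data: ten C-alpha atoms spaced 3.8 angstroms apart along the x axis.
native = [(3.8 * i, 0.0, 0.0) for i in range(10)]

# Eight atoms displaced by 1 angstrom, two displaced by 5 angstroms.
model_good = [
    (x, y + (1.0 if i < 8 else 5.0), z) for i, (x, y, z) in enumerate(native)
]

# Seven atoms displaced by 1 angstrom, three displaced by 5 angstroms.
model_borderline = [
    (x, y + (1.0 if i < 7 else 5.0), z) for i, (x, y, z) in enumerate(native)
]
```

In this toy example `model_good` places 8 of 10 atoms within the cutoff and passes, while `model_borderline` places exactly 7 of 10 and fails, because the criterion requires more than 70%.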
