Improvement of 3D protein models using multiple templates guided by single-template model quality assessment

MOTIVATION: Modelling the 3D structures of proteins can often be enhanced if more than one fold template is used during the modelling process. However, in many cases, using multiple templates may instead result in poorer model quality for a given target or alignment method. There is a need for modelling protocols that can both consistently and significantly improve 3D models and indicate when models might not benefit from the use of multiple target-template alignments. Here, we investigate the use of both global and local model quality prediction scores produced by ModFOLDclust2 to improve the selection of target-template alignments for the construction of multiple-template models. Additionally, we evaluate clustering the resulting population of multi- and single-template models for the improvement of our IntFOLD-TS tertiary structure prediction method.

RESULTS: We find that using accurate local model quality scores to guide alignment selection is the most consistent way to significantly improve models for each of the sequence-to-structure alignment methods tested. In addition, using accurate global model quality scores for re-ranking alignments, prior to selection, further improves the majority of multi-template modelling methods tested. Furthermore, subsequent clustering of the resulting population of multiple-template models significantly improves the quality of selected models compared with the previous version of our tertiary structure prediction method, IntFOLD-TS.

AVAILABILITY AND IMPLEMENTATION: Source code and binaries can be freely downloaded from http://www.reading.ac.uk/bioinf/downloads/
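
The selection idea summarised in the abstract (re-rank single-template models by a global quality score, then use local quality scores to decide which further alignments are worth adding) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `SingleTemplateModel` class, the `select_templates` function, the greedy coverage rule, and the `error_cutoff`, `max_templates` and `min_gain` parameters are all hypothetical choices made for illustration, and the scores are assumed to have been computed already by a quality assessment tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SingleTemplateModel:
    """Hypothetical container for one single-template model and its scores."""
    template_id: str
    global_score: float        # predicted global quality (higher is better)
    local_errors: List[float]  # predicted per-residue error; large where unreliable


def select_templates(models: List[SingleTemplateModel],
                     max_templates: int = 3,
                     error_cutoff: float = 3.5,
                     min_gain: int = 1) -> List[str]:
    """Greedy, illustrative selection of templates for multi-template modelling.

    1. Re-rank the single-template models by predicted global quality.
    2. Always keep the top-ranked model.
    3. Add a further template only if its local scores mark at least
       `min_gain` residues as reliable that no selected model covers yet.
    """
    if not models:
        return []

    # Step 1: re-rank by global quality, best first.
    ranked = sorted(models, key=lambda m: m.global_score, reverse=True)

    # Step 2: seed the selection with the top-ranked model.
    selected = [ranked[0]]
    covered = {i for i, err in enumerate(ranked[0].local_errors)
               if err <= error_cutoff}

    # Step 3: add templates that contribute newly reliable residues.
    for model in ranked[1:]:
        if len(selected) >= max_templates:
            break
        reliable = {i for i, err in enumerate(model.local_errors)
                    if err <= error_cutoff}
        if len(reliable - covered) >= min_gain:
            selected.append(model)
            covered |= reliable

    return [m.template_id for m in selected]
```

In this sketch a target whose top-ranked model already covers every residue reliably ends up with a single selected template, which is one simple way to signal that multi-template modelling is unlikely to help for that target.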
