A multiple‐template approach to protein threading

Most threading methods predict the structure of a protein using only a single template. As the number of solved structures grows, a protein without a solved structure is increasingly likely to have more than one similar template structure. A natural question is therefore whether modeling accuracy can be improved by using multiple templates. This article describes a new multiple‐template threading method to answer this question. At its heart is a novel probabilistic‐consistency algorithm that can accurately align a single protein sequence simultaneously to multiple templates. Experimental results indicate that our multiple‐template method can improve pairwise sequence‐template alignment accuracy and generate models of better quality than single‐template models, even when those are built from the best single templates (P‐value < 10^-6), whereas many popular multiple sequence/structure alignment tools fail to do so. The underlying reason is that our probabilistic‐consistency algorithm can generate accurate multiple sequence/template alignments. In other words, without an accurate multiple sequence/template alignment, modeling accuracy cannot be improved simply by using multiple templates to increase alignment coverage. Blindly tested on the CASP9 targets with more than one good template structure, our method outperforms all other CASP9 servers except two (Zhang‐Server and QUARK, both from the same group). Our probabilistic‐consistency algorithm could possibly be extended to align multiple protein/RNA sequences and structures. Proteins 2011; © 2011 Wiley‐Liss, Inc.
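
The abstract does not spell out the probabilistic‐consistency algorithm, so the following is only a rough illustration of the general idea behind consistency‐based alignment, not the authors' method. It assumes a generic consistency update in which the posterior probability that sequence residue i aligns to residue j of one template is reinforced by paths that pass through every other template. All names (`matmul`, `consistency_update`, `posteriors`, `links`) and the toy numbers are hypothetical.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def consistency_update(seq_to_tmpl, tmpl_to_tmpl):
    """Run one generic consistency pass (an assumed scheme, not the paper's).

    seq_to_tmpl[k]        : posterior matrix, rows = sequence residues,
                            columns = residues of template k
    tmpl_to_tmpl[(l, k)]  : posterior matrix, rows = residues of template l,
                            columns = residues of template k
    """
    names = list(seq_to_tmpl)
    n_members = len(names) + 1  # all templates plus the query sequence
    updated = {}
    for k in names:
        # The direct posterior is counted twice: once with the sequence itself
        # as the intermediate, once with template k as the intermediate.
        total = [[2.0 * p for p in row] for row in seq_to_tmpl[k]]
        for l in names:
            if l == k:
                continue
            # Evidence routed through template l: sequence -> l -> k.
            via = matmul(seq_to_tmpl[l], tmpl_to_tmpl[(l, k)])
            total = [[t + v for t, v in zip(t_row, v_row)]
                     for t_row, v_row in zip(total, via)]
        updated[k] = [[t / n_members for t in row] for row in total]
    return updated


# Toy input: a 2-residue sequence and two 2-residue templates, "A" and "B".
posteriors = {
    "A": [[0.9, 0.1], [0.1, 0.9]],  # confident sequence-to-A alignment
    "B": [[0.5, 0.5], [0.5, 0.5]],  # uninformative sequence-to-B alignment
}
identity = [[1.0, 0.0], [0.0, 1.0]]
links = {("A", "B"): identity, ("B", "A"): identity}  # A and B align residue by residue

refined = consistency_update(posteriors, links)
```

In this toy case the uninformative sequence‐to‐B posteriors shift toward the diagonal after the update, because the confident sequence‐to‐A alignment and the A‐to‐B correspondence agree on it. This illustrates why a consistent multiple alignment can be more accurate than independent pairwise alignments.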
