Comparative protein structure modeling by combining multiple templates and optimizing sequence-to-structure alignments

MOTIVATION: Two major bottlenecks in advancing comparative protein structure modeling are the efficient combination of multiple template structures and the generation of a correct input target-template alignment.

RESULTS: A novel method, Multiple Mapping Method with Multiple Templates (M4T), is introduced. It implements an algorithm to automatically select and combine Multiple Template structures (MT) and an alignment optimization protocol (Multiple Mapping Method, MMM). The MT module of M4T selects and combines multiple template structures through an iterative clustering approach that takes into account the 'unique' contribution of each template, the sequence similarity of the templates to one another and to the target sequence, and their experimental resolution. MMM is a sequence-to-structure alignment method that optimally combines alternatively aligned regions according to their fit in the structural environment of the template structure. The resulting M4T alignment is used as input to a comparative modeling module. M4T has been benchmarked on CASP6 comparative modeling target sequences and on a larger independent test set, where it compared favorably with current state-of-the-art methods.
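The template selection idea can be illustrated with a minimal sketch. It is not the published MT algorithm: the abstract does not specify the clustering procedure, so the code below substitutes a simple greedy pass, and the `Template` class, the `select_templates` function, and the `min_unique` threshold are all hypothetical names and values chosen for illustration. Templates are ranked by sequence identity to the target and then by experimental resolution, and a template is kept only if it covers enough target residues that no previously kept template covers (its 'unique' contribution). Similarity among the templates themselves is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical record for one candidate template structure."""
    name: str
    covered: frozenset   # target residue indices aligned to this template
    identity: float      # sequence identity to the target, in [0, 1]
    resolution: float    # experimental resolution in angstroms (lower is better)


def select_templates(templates, min_unique=5):
    """Greedy stand-in for iterative template selection (an assumption,
    not the published method).

    Templates are visited from most to least similar to the target, with
    ties broken by better (lower) resolution. The first template is always
    kept; each later one is kept only if it adds at least `min_unique`
    target residues not already covered by the kept set.
    """
    ranked = sorted(templates, key=lambda t: (-t.identity, t.resolution))
    selected = []
    covered = set()
    for t in ranked:
        unique = t.covered - covered
        if not selected or len(unique) >= min_unique:
            selected.append(t)
            covered |= t.covered
    return selected


# Toy input: B is fully contained in A, C extends coverage past A.
candidates = [
    Template("A", frozenset(range(0, 100)), identity=0.50, resolution=2.0),
    Template("B", frozenset(range(10, 90)), identity=0.45, resolution=1.5),
    Template("C", frozenset(range(90, 130)), identity=0.30, resolution=2.5),
]
chosen = [t.name for t in select_templates(candidates)]
```

In this toy input, template B is dropped because every residue it covers is already covered by A, while C is kept because it extends coverage beyond residue 99. A fuller treatment would also weigh how similar the templates are to each other, which the abstract lists as one of the selection criteria.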
