TM-align: a protein structure alignment algorithm based on the TM-score

We have developed TM-align, a new algorithm to identify the best structural alignment between protein pairs that combines the TM-score rotation matrix and Dynamic Programming (DP). The algorithm is ∼4 times faster than CE and 20 times faster than DALI and SAL. On average, the resulting structure alignments have higher accuracy and coverage than those provided by these most often-used methods. TM-align is applied to an all-against-all structure comparison of 10 515 representative protein chains from the Protein Data Bank (PDB) with a sequence identity cutoff <95%: 1996 distinct folds are found when a TM-score threshold of 0.5 is used. We also use TM-align to match the models predicted by TASSER for solved non-homologous proteins in PDB. For both folded and misfolded models, TM-align can almost always find close structural analogs, with an average root mean square deviation, RMSD, of 3 Å and 87% alignment coverage. Nevertheless, there exists a significant correlation between the correctness of the predicted structure and the structural similarity of the model to the other proteins in the PDB. This correlation could be used to assist in model selection in blind protein structure predictions. The TM-align program is freely downloadable at .

[1]  C. Levinthal How to fold graciously , 1969 .

[2]  W. Kabsch A discussion of the solution for the best rotation to relate two sets of vectors , 1978 .

[3]  W. Kabsch,et al.  Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features , 1983, Biopolymers.

[4]  David I. Stuart,et al.  Crystal structure at 2.8 Å resolution of a soluble form of the cell adhesion molecule CD2 , 1992, Nature.

[5]  T. Blundell,et al.  Comparative protein modelling by satisfaction of spatial restraints. , 1993, Journal of molecular biology.

[6]  C. Sander,et al.  Protein structure comparison by alignment of distance matrices. , 1993, Journal of molecular biology.

[7]  R. Lathrop The protein threading problem with sequence amino acid interaction preferences is NP-complete. , 1994, Protein engineering.

[8]  A G Murzin,et al.  SCOP: a structural classification of proteins database for the investigation of sequences and structures. , 1995, Journal of molecular biology.

[9]  C. Sander,et al.  Dali: a network tool for protein structure comparison. , 1995, Trends in biochemical sciences.

[10]  David C. Jones,et al.  CATH--a hierarchic classification of protein domain structures. , 1997, Structure.

[11]  P E Bourne,et al.  Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. , 1998, Protein engineering.

[12]  M. Levitt,et al.  A unified statistical framework for sequence comparison and structure comparison. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[13]  John J. Barker,et al.  Engineering an intertwined form of CD2 for stability and assembly , 1998, Nature Structural Biology.

[14]  T J Hubbard RMS/Coverage graphs: A qualitative method for comparing three‐dimensional protein structure predictions , 1999, Proteins.

[15]  Jacquelyn S. Fetrow,et al.  Structural genomics and its importance for gene function analysis , 2000, Nature Biotechnology.

[16]  T. N. Bhat,et al.  The Protein Data Bank , 2000, Nucleic Acids Res..

[17]  B Honig,et al.  An integrated approach to the analysis and modeling of protein sequences and structures. I. Protein structural alignment and a quantitative measure for protein structural distance. , 2000, Journal of molecular biology.

[18]  A. Sali,et al.  Protein Structure Prediction and Structural Genomics , 2001, Science.

[19]  J Skolnick,et al.  Universal similarity measure for comparing protein structures. , 2001, Biopolymers.

[20]  J. Adelman,et al.  Structure of the gating domain of a Ca2+-activated K+ channel complexed with Ca2+/calmodulin , 2001, Nature.

[21]  Tim J. P. Hubbard,et al.  MaxBench: evaluation of sequence and structure comparison methods , 2002, Bioinform..

[22]  B. Fakler,et al.  A Helical Region in the C Terminus of Small-conductance Ca2+-activated K+ Channels Controls Assembly with Apo-calmodulin* , 2002, The Journal of Biological Chemistry.

[23]  Daisuke Kihara,et al.  Ab initio protein structure prediction on a genomic scale: Application to the Mycoplasma genitalium genome , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[24]  Patrick Aloy,et al.  Predictions without templates: New folds, secondary structure, and contacts in CASP5 , 2003, Proteins.

[25]  Jens Meiler,et al.  Rosetta predictions in CASP5: Successes, failures, and prospects for complete automation , 2003, Proteins.

[26]  J. Skolnick,et al.  The PDB is a covering set of small protein structures. , 2003, Journal of molecular biology.

[27]  B. Rost,et al.  Critical assessment of methods of protein structure prediction (CASP)—Round 6 , 2005, Proteins.

[28]  J. Skolnick,et al.  TOUCHSTONE II: a new approach to ab initio protein structure prediction. , 2003, Biophysical journal.

[29]  W. Pearson,et al.  Sensitivity and selectivity in protein structure comparison , 2004, Protein science : a publication of the Protein Society.

[30]  Gerard J Kleywegt,et al.  Evaluation of protein fold comparison servers , 2003, Proteins.

[31]  Narayanan Eswar,et al.  High-throughput computational and experimental techniques in structural genomics. , 2004, Genome research.

[32]  J. Skolnick,et al.  Automated structure prediction of weakly homologous proteins on a genomic scale. , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[33]  Yang Zhang,et al.  Tertiary structure predictions on a comprehensive benchmark of medium to large size proteins. , 2004, Biophysical journal.

[34]  Yang Zhang,et al.  Scoring function for automated assessment of protein structure template quality , 2004, Proteins.

[35]  Yang Zhang,et al.  SPICKER: A clustering approach to identify near‐native protein folds , 2004, J. Comput. Chem..

[36]  Yang Zhang,et al.  Large-scale assessment of the utility of low-resolution protein structures for biochemical function assignment , 2004, Bioinform..

[37]  Rachel Kolodny,et al.  Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures. , 2005, Journal of molecular biology.

[38]  Yang Zhang,et al.  The protein structure prediction problem could be solved using the current PDB library. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[39]  Christus,et al.  A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins , 2022 .