Scoring function for automated assessment of protein structure template quality

We have developed a new scoring function, the template modeling score (TM‐score), to assess the quality of protein structure templates and predicted full‐length models by extending the approaches used in the Global Distance Test (GDT)¹ and MaxSub.² First, a protein size‐dependent scale is used to eliminate the inherent protein size dependence of the previous scores and to account appropriately for random protein structure pairs. Second, rather than setting specific distance cutoffs and counting only the fraction of residues with errors below each cutoff, the proposed score evaluates all residue pairs in the alignment or model. To compare the scoring functions, we constructed a large‐scale benchmark set of structure templates for 1489 small to medium size proteins using the threading program PROSPECTOR_3 and built the full‐length models using MODELLER and TASSER. The TM‐score of the initial threading alignments correlates much more strongly with the quality of the final full‐length models than the GDT and MaxSub scores do. The TM‐score is further applied to all ‘new fold’ targets in the recent CASP5 experiment and agrees closely with the results of human‐expert visual assessment. These data suggest that the TM‐score is a useful complement to the fully automated assessment of protein structure predictions. The executable program of TM‐score is freely downloadable at http://bioinformatics.buffalo.edu/TM‐score. Proteins 2004. © 2004 Wiley‐Liss, Inc.
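The two design choices described above can be illustrated with a short sketch. The abstract does not state the formula. The code below assumes the commonly cited form of the score, in which each aligned residue pair at distance d contributes 1 / (1 + (d / d0)^2), the sum is normalized by the target length L, and the size-dependent scale is d0 = 1.24 (L - 15)^(1/3) - 1.8 Å. The function names are hypothetical, and the sketch scores a single, already superposed set of distances; the search over superpositions that maximizes the score is omitted. A cutoff-based fraction is included only to contrast with the GDT/MaxSub style of counting.

```python
def size_dependent_scale(target_length):
    """Distance scale d0 in angstroms (assumed form: 1.24 * (L - 15)^(1/3) - 1.8)."""
    if target_length <= 15:
        raise ValueError("scale is undefined for targets of 15 residues or fewer")
    scale = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if scale <= 0.0:
        # Hypothetical guard: very short targets give a non-positive scale.
        raise ValueError("target too short for a positive distance scale")
    return scale


def tm_score_fixed_superposition(distances, target_length):
    """Score one fixed superposition; every aligned pair contributes, none is cut off.

    distances: per-residue distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the target, used for normalization.
    """
    scale = size_dependent_scale(target_length)
    total = sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances)
    return total / target_length


def cutoff_fraction(distances, target_length, cutoff=4.0):
    """Cutoff-style measure for contrast: fraction of target residues within the cutoff."""
    within = sum(1 for d in distances if d <= cutoff)
    return within / target_length


if __name__ == "__main__":
    example = [0.5, 1.2, 3.9, 4.1, 7.5, 12.0]  # made-up distances, for illustration only
    print(tm_score_fixed_superposition(example, target_length=100))
    print(cutoff_fraction(example, target_length=100))
```

In this sketch a pair at 3.9 Å and a pair at 4.1 Å receive nearly the same weight, whereas a hard 4 Å cutoff counts one and discards the other. Unaligned residues contribute nothing to the sum but still count in the normalization.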

[1]  B. Rost,et al.  Critical assessment of methods of protein structure prediction (CASP)—Round 6 , 2005, Proteins.

[2]  W R Taylor,et al.  Homology modelling by distance geometry. , 1996, Folding & design.

[3]  Osvaldo Olmea,et al.  MAMMOTH (Matching molecular models obtained from theory): An automated method for model comparison , 2002, Protein science : a publication of the Protein Society.

[4]  R. Perham,et al.  Prediction of the three‐dimensional structures of the biotinylated domain from yeast pyruvate carboxylase and of the lipoylated H‐protein from the pea leaf glycine cleavage system: A new automated method for the prediction of protein tertiary structure , 1993, Protein science : a publication of the Protein Society.

[5]  T L Blundell,et al.  An evaluation of the performance of an automated procedure for comparative modelling of protein tertiary structure. , 1993, Protein engineering.

[6]  D. T. Jones,et al.  A new approach to protein fold recognition , 1992, Nature.

[7]  M. Levitt,et al.  A unified statistical framework for sequence comparison and structure comparison. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[8]  Arne Elofsson,et al.  MaxSub: an automated measure for the assessment of protein structure prediction quality , 2000, Bioinform..

[9]  D Fischer,et al.  LiveBench‐2: Large‐scale automated evaluation of protein structure prediction servers , 2001, Proteins.

[10]  J. Skolnick,et al.  Development and large scale benchmark testing of the PROSPECTOR_3 threading algorithm , 2004, Proteins.

[11]  J. Skolnick,et al.  Automated structure prediction of weakly homologous proteins on a genomic scale. , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[12]  S. Pongor,et al.  A normalized root‐mean‐square distance for comparing protein three‐dimensional structures , 2001, Protein science : a publication of the Protein Society.

[13]  W. Kabsch.  A solution for the best rotation to relate two sets of vectors , 1976.

[14]  Patrick Aloy,et al.  Predictions without templates: New folds, secondary structure, and contacts in CASP5 , 2003, Proteins.

[15]  C. Sander,et al.  Dali: a network tool for protein structure comparison. , 1995, Trends in biochemical sciences.

[16]  T. Blundell,et al.  Comparative protein modelling by satisfaction of spatial restraints. , 1993, Journal of molecular biology.

[17]  Timothy F. Havel,et al.  A new method for building protein conformations from sequence alignments with homologues of known structure. , 1991, Journal of molecular biology.

[18]  J. Skolnick,et al.  The PDB is a covering set of small protein structures. , 2003, Journal of molecular biology.

[19]  Arne Elofsson,et al.  A study of quality measures for protein threading models , 2001, BMC Bioinformatics.

[20]  J. Skolnick,et al.  Assembly of protein structure from sparse experimental data: An efficient Monte Carlo model , 1998, Proteins.

[21]  T. Hubbard,et al.  Critical assessment of methods of protein structure prediction (CASP): Round III , 1999, Proteins.

[22]  Anna Tramontano,et al.  Assessment of homology‐based predictions in CASP5 , 2003, Proteins.

[23]  W. Kabsch.  A discussion of the solution for the best rotation to relate two sets of vectors , 1978.

[24]  Mark Gerstein,et al.  Using Iterative Dynamic Programming to Obtain Accurate Pairwise and Multiple Alignments of Protein Structures , 1996, ISMB.

[25]  J. Skolnick,et al.  TOUCHSTONE II: a new approach to ab initio protein structure prediction. , 2003, Biophysical journal.

[26]  D. Eisenberg,et al.  A method to identify protein sequences that fold into a known three-dimensional structure. , 1991, Science.

[27]  S. Bryant,et al.  Statistics of sequence-structure threading. , 1995, Current opinion in structural biology.

[28]  A. Sali,et al.  Comparative protein structure modeling of genes and genomes. , 2000, Annual review of biophysics and biomolecular structure.

[29]  Adam Zemla,et al.  Critical assessment of methods of protein structure prediction (CASP)‐round V , 2005, Proteins.

[30]  J. Skolnick,et al.  What is the probability of a chance prediction of a protein structure with an rmsd of 6 Å? , 1998, Folding & design.

[31]  T. N. Bhat,et al.  The Protein Data Bank , 2000, Nucleic Acids Res..

[32]  C Venclovas,et al.  Processing and analysis of CASP3 protein structure predictions , 1999, Proteins.

[33]  Adam Zemla,et al.  LGA: a method for finding 3D similarities in protein structures , 2003, Nucleic Acids Res..

[34]  J Skolnick,et al.  Universal similarity measure for comparing protein structures. , 2001, Biopolymers.

[35]  Lisa N Kinch,et al.  CASP5 assessment of fold recognition target predictions , 2003, Proteins.

[36]  A. Sali,et al.  Modeling of loops in protein structures , 2000, Protein science : a publication of the Protein Society.