Evaluation of protein fold comparison servers

When a new protein structure has been determined, comparison with the database of known structures enables classification of its fold as new or as belonging to a known class of proteins. This in turn may provide clues about the function of the protein. A large number of fold comparison programs have been developed, but they have never been subjected to a comprehensive and critical comparative analysis. Here we describe an evaluation of 11 publicly available, Web‐based servers for automatic fold comparison. Both their functionality (e.g., user interface, presentation, and annotation of results) and their performance (i.e., how well established structural similarities are recognized) were assessed. The servers were subjected to a battery of performance tests covering a broad spectrum of folds as well as special cases, such as multidomain proteins, Cα‐only models, new folds, and NMR‐based models. The CATH structural classification system was used as a reference. These tests revealed the strengths and weaknesses of each server. On the whole, CE, DALI, MATRAS, and VAST showed the best performance, but none of the servers achieved a 100% success rate. When a given server finds no structurally similar proteins, it is recommended to try one or two other servers before drawing any conclusions about the novelty of a fold. Proteins 2004. © 2003 Wiley‐Liss, Inc.
