The Structure Superposition Database

The need for new tools for investigating biological systems on a large scale is becoming acute, particularly for computationally intensive analyses such as comparisons of many three-dimensional protein structures. Structure superposition is a valuable approach for understanding evolutionary relationships and for predicting function. Although available tools are adequate for generating and viewing superpositions of single pairs of protein structures, they are generally too cumbersome and time-consuming for examining multiple superpositions. To address this need, we have created the Structure Superposition Database (SSD) for accessing, viewing and understanding large sets of structure superposition data. The initial implementation of the SSD contains the results of pairwise, all-by-all superpositions of a representative set of 115 (beta/alpha)8 barrel structures (TIM barrels). We plan to extend the database to include representative structure superpositions for many additional folds. The SSD can be browsed with a user interface module developed as an extension to Chimera, an extensible molecular modeling program. Features of the user interface module facilitate viewing multiple superpositions together. The SSD interface module can be downloaded from http://ssd.rbvi.ucsf.edu.
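To give a sense of the scale of an all-by-all comparison, the following minimal Python sketch enumerates every unordered pair in a set of structures and stores one score per pair. It is an illustration only, not the SSD's method or schema: the function names, the toy identifiers, and the choice of unordered pairs are assumptions, and the RMSD shown is computed on coordinates that are assumed to be already superimposed and matched atom for atom. A real superposition would first have to find residue equivalences and an optimal rigid-body transformation.

```python
from itertools import combinations
from math import sqrt


def pair_count(n):
    """Number of unordered pairs among n structures (assumes A-B equals B-A)."""
    return n * (n - 1) // 2


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the points are already superimposed and paired one to one;
    no alignment or rotation is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return sqrt(total / len(coords_a))


def all_by_all(structures):
    """Return {(id_a, id_b): rmsd} for every unordered pair of structures.

    `structures` maps a hypothetical identifier to a list of (x, y, z) points.
    """
    return {
        (a, b): rmsd(structures[a], structures[b])
        for a, b in combinations(sorted(structures), 2)
    }


# Toy data with made-up identifiers and coordinates.
toy = {
    "s1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "s2": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "s3": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
}
results = all_by_all(toy)
```

Under the unordered-pair assumption, `pair_count(115)` gives 6555 pairwise comparisons for a 115-structure set, which illustrates why inspecting the superpositions one pair at a time quickly becomes impractical.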
