Augmenting SSEs with structural properties for rapid protein structure comparison

Comparing protein structures in three dimensions is computationally expensive, which makes a full scan of a protein against a library of known structures impractical. To reduce the cost, an approximation of the three-dimensional structure can be used to compare proteins quickly and filter out dissimilar ones. In this paper we present SCALE, a new algorithm for protein structure comparison. SCALE represents a protein as a sequence of secondary structure elements (SSEs) augmented with 3D structural properties such as the distances and angles between the SSEs. Comparing two proteins thus reduces to a sequence alignment problem over their SSE sequences, with the 3D structural properties contributing to the similarity score. We have implemented SCALE and compared its performance against existing schemes. Our performance study shows that SCALE outperforms existing methods in both efficiency and effectiveness (measured by precision and recall).
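To make the idea of aligning augmented SSE sequences concrete, the following is a minimal sketch and not the SCALE algorithm itself. It assumes a standard global dynamic-programming alignment, and every name and parameter in it (`SSE`, `sse_score`, `align_score`, the gap penalty, the distance and angle scales, the equal weighting of the two geometric terms) is hypothetical and chosen only for illustration. Each SSE carries its type plus the distance and angle to the preceding SSE, and the match score grows as those geometric properties agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSE:
    """One secondary structure element with illustrative 3D annotations."""
    kind: str          # 'H' for helix, 'E' for strand (assumed encoding)
    dist_prev: float   # distance to the previous SSE, in angstroms (0.0 for the first)
    angle_prev: float  # angle to the previous SSE, in degrees (0.0 for the first)


def sse_score(a, b, dist_scale=10.0, angle_scale=90.0):
    """Score a pair of SSEs; the scales and weights are assumptions."""
    if a.kind != b.kind:
        return -1.0  # mismatched SSE types are penalised
    # Each geometric term is 1.0 for identical values and falls linearly to 0.0.
    dist_term = max(0.0, 1.0 - abs(a.dist_prev - b.dist_prev) / dist_scale)
    angle_term = max(0.0, 1.0 - abs(a.angle_prev - b.angle_prev) / angle_scale)
    return 1.0 + 0.5 * dist_term + 0.5 * angle_term


def align_score(p, q, gap=-0.5):
    """Global alignment score of two SSE sequences with a linear gap penalty."""
    n, m = len(p), len(q)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + sse_score(p[i - 1], q[j - 1]),  # align the two SSEs
                dp[i - 1][j] + gap,                                # gap in q
                dp[i][j - 1] + gap,                                # gap in p
            )
    return dp[n][m]


# Two toy proteins with the same SSE types but different geometry.
protein_a = [SSE('H', 0.0, 0.0), SSE('E', 8.0, 30.0), SSE('H', 12.0, 120.0)]
protein_b = [SSE('H', 0.0, 0.0), SSE('E', 13.0, 75.0), SSE('H', 12.0, 120.0)]

print(align_score(protein_a, protein_a))  # identical geometry
print(align_score(protein_a, protein_b))  # same topology, different geometry
```

Because the table has only as many rows and columns as there are SSEs, typically far fewer than residues, a score of this kind is cheap enough to use as a filter: library entries whose score against the query falls below some chosen threshold could be discarded before any full 3D comparison.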
