High dimensional indexing for protein structure matching using bowties

To determine functional dependencies between two proteins, both represented as 3D structures, an essential condition is that the two proteins have a matching structure. Because protein 3D structures are large, complex, and constantly evolving, identifying the possible locations and sizes of such a matching structure for a given protein against a large protein database is very time-consuming. In this paper, we introduce a novel representation model, and we transform and formalize the problem. We then propose a database solution based on innovative high-dimensional indexing mechanisms. Experimental results show that high-dimensional indexing performs promisingly on this biologically critical but previously computationally prohibitive problem.
