PROuST: A Comparison Method of Three-Dimensional Structures of Proteins Using Indexing Techniques

We present a new method for protein structure comparison that combines indexing with dynamic programming (DP). The method relies on simple geometric features of triplets of secondary structure elements. These features serve as indexes into a hash table, which allows fast retrieval of similarity information for a query protein. The query protein is first matched against all proteins in the hash table, producing a list of putative similarities; the DP algorithm then aligns the query with each protein on this list. Because pairwise DP comparison is applied only to a small subset of proteins, and because DP reuses information already computed and stored in the hash table, the approach is very fast even when searching the entire PDB. Extensive experiments show that our approach achieves results comparable in quality to those of existing approaches while generally being faster.
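To make the indexing stage concrete, the following Python sketch illustrates one way a triplet-based hash table could be built and queried. It is not the PROuST implementation: the specific features (pairwise axis angles and midpoint distances), the bin widths, the representation of a secondary structure element as a pair of 3D endpoints, and all function names are assumptions chosen for illustration. The DP alignment stage is omitted.

```python
import math
from collections import defaultdict
from itertools import combinations

# Assumption: each secondary structure element (SSE) is approximated by a
# segment, given as a pair of 3D endpoints ((x, y, z), (x, y, z)).

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _norm(v):
    return math.sqrt(sum(x * x for x in v))

def _angle_deg(u, v):
    # Angle between two direction vectors, clamped for numerical safety.
    cos = sum(x * y for x, y in zip(u, v)) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def _midpoint(seg):
    return tuple((p + q) / 2.0 for p, q in zip(seg[0], seg[1]))

def triplet_key(a, b, c, angle_bin=30.0, dist_bin=5.0):
    """Quantize pairwise angles and midpoint distances of three SSEs.

    The feature choice and bin widths are illustrative assumptions.
    """
    key = []
    for s, t in combinations((a, b, c), 2):
        ang = _angle_deg(_sub(s[1], s[0]), _sub(t[1], t[0]))
        dist = _norm(_sub(_midpoint(s), _midpoint(t)))
        key.append((int(ang // angle_bin), int(dist // dist_bin)))
    return tuple(key)

def build_table(proteins):
    """Index every ordered SSE triplet of every protein by its key."""
    table = defaultdict(list)
    for pid, sses in proteins.items():
        for i, j, k in combinations(range(len(sses)), 3):
            key = triplet_key(sses[i], sses[j], sses[k])
            table[key].append((pid, (i, j, k)))
    return table

def rank_candidates(table, query_sses):
    """Vote for each protein sharing a triplet key with the query."""
    votes = defaultdict(int)
    for i, j, k in combinations(range(len(query_sses)), 3):
        key = triplet_key(query_sses[i], query_sses[j], query_sses[k])
        for pid, _ in table.get(key, ()):
            votes[pid] += 1
    # Highest vote count first; these would be passed on to DP alignment.
    return sorted(votes.items(), key=lambda kv: -kv[1])

# Toy data (hypothetical): protein "A" has mutually perpendicular SSEs,
# protein "B" has three parallel SSEs.
proteins = {
    "A": [((0, 0, 0), (10, 0, 0)), ((0, 5, 0), (0, 5, 10)), ((0, 0, 12), (0, 10, 12))],
    "B": [((0, 0, 0), (10, 0, 0)), ((0, 4, 0), (10, 4, 0)), ((0, 8, 0), (10, 8, 0))],
}
table = build_table(proteins)

# The query is protein "A" translated by (3, 3, 3); the features are
# invariant under rigid translation, so it should hit the same bucket.
query = [tuple(tuple(c + 3 for c in p) for p in seg) for seg in proteins["A"]]
ranking = rank_candidates(table, query)
```

Because the key depends only on relative geometry, a lookup costs time proportional to the number of query triplets rather than to the size of the database, which is the property that lets the expensive pairwise alignment be restricted to a short candidate list. A production scheme would also need to handle degenerate (zero-length) segments and near-boundary binning, which this sketch ignores.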
