Rapid retrieval of protein structures from databases.

As protein databases continue to grow in size, exhaustive search methods that compare a query structure against every database structure can no longer provide satisfactory performance. The filter-and-refine paradigm offers an efficient alternative for database search without compromising the accuracy of the answers. In this paradigm, protein structures are represented in an abstract form. During querying, the filtering phase uses these abstract representations to prune dissimilar structures quickly, so that only a small collection of promising structures is examined with a detailed structure alignment technique in the refinement phase. This article mainly reviews techniques developed for the filtering phase.
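The two phases can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed methods: the abstract representation (a normalized histogram of pairwise inter-residue distances), the L1 filter distance, the `keep` cutoff, and every function name are assumptions chosen for illustration, and `toy_refine_score` is only a stand-in for a real detailed structure alignment technique.

```python
import math
from typing import Callable, Dict, List, Tuple

Coords = List[Tuple[float, float, float]]  # one 3D point per residue (assumed input)


def distance_histogram(coords: Coords, bin_width: float = 4.0, n_bins: int = 8) -> List[float]:
    """Hypothetical abstract representation: normalized histogram of pairwise distances."""
    hist = [0.0] * n_bins
    n_pairs = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            # Distances beyond the last bin edge are clamped into the last bin.
            hist[min(int(d // bin_width), n_bins - 1)] += 1
            n_pairs += 1
    return [h / n_pairs for h in hist] if n_pairs else hist


def l1_distance(a: List[float], b: List[float]) -> float:
    """Cheap dissimilarity between two abstract representations."""
    return sum(abs(x - y) for x, y in zip(a, b))


def toy_refine_score(a: Coords, b: Coords) -> float:
    """Stand-in for a detailed alignment: mean absolute difference of
    inter-residue distances over the shared prefix (lower is more similar)."""
    n = min(len(a), len(b))
    diffs = [
        abs(math.dist(a[i], a[j]) - math.dist(b[i], b[j]))
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


def filter_and_refine(
    query: Coords,
    database: Dict[str, Coords],
    index: Dict[str, List[float]],
    refine: Callable[[Coords, Coords], float] = toy_refine_score,
    keep: int = 2,
) -> List[Tuple[str, float]]:
    """Rank database entries against the query in two phases."""
    q_feat = distance_histogram(query)
    # Filtering phase: compare abstract representations only and keep the best `keep`.
    candidates = sorted(index, key=lambda name: l1_distance(q_feat, index[name]))[:keep]
    # Refinement phase: run the expensive comparison on the surviving candidates only.
    scored = [(name, refine(query, database[name])) for name in candidates]
    return sorted(scored, key=lambda item: item[1])
```

In such a setup the index (`{name: distance_histogram(coords)}` for each database entry) would be built once, offline, so that a query pays for only one feature extraction, a cheap scan of the index, and `keep` expensive comparisons. Whether the filter can discard a true match depends entirely on the representation and the cutoff; this sketch makes no guarantee on that point.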
