Similarity search for local protein structures at atomic resolution by exploiting a database management system

A method to search for local structural similarities in proteins at atomic resolution is presented. We demonstrate that a large volume of structural data can be handled within reasonable CPU time by using a conventional relational database management system with appropriate indexing of geometric data. This method, which we call geometric indexing, can enumerate ligand binding sites that are structurally similar to sub-structures of a query protein among more than 160,000 possible candidates within a few hours of CPU time on an ordinary desktop computer. After a set of high-scoring ligand binding sites is detected by the geometric indexing search, structural alignments at atomic resolution are constructed by iteratively applying the Hungarian algorithm, and the statistical significance of the final score is estimated from an empirical model based on a gamma distribution. Applications of this method to several protein structures clearly show that significant similarities can be detected between local structures of non-homologous as well as homologous proteins.
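
The general idea of indexing geometric data in a relational database can be illustrated with a minimal sketch. The abstract does not specify the actual features or schema, so everything below is an assumption made for illustration: each binding site is reduced to atom-pair features (two atom types plus a binned distance), the hypothetical table `pair_feature` stores them, a composite index supports the lookup, and candidate sites are ranked by the number of distinct query features they share. The names `BIN_WIDTH`, `features`, `build_db`, and `rank_sites` are likewise hypothetical, and SQLite stands in for whatever database system was actually used.

```python
import sqlite3
from itertools import combinations
from math import dist

BIN_WIDTH = 1.0  # assumed distance bin width (angstroms); not taken from the paper


def features(atoms):
    """Yield (type_a, type_b, dist_bin) for every atom pair of a site.

    `atoms` is a list of (atom_type, (x, y, z)) tuples. Atom types are
    sorted so that the feature does not depend on atom order.
    """
    for (ta, pa), (tb, pb) in combinations(atoms, 2):
        t1, t2 = sorted((ta, tb))
        yield t1, t2, int(dist(pa, pb) // BIN_WIDTH)


def build_db(sites):
    """Store the pair features of all sites in an indexed in-memory table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE pair_feature "
        "(site_id TEXT, type_a TEXT, type_b TEXT, dist_bin INTEGER)"
    )
    # Composite index: the geometric lookup becomes an index search
    # instead of a scan over all stored sites.
    con.execute(
        "CREATE INDEX idx_feature ON pair_feature (type_a, type_b, dist_bin)"
    )
    for site_id, atoms in sites.items():
        con.executemany(
            "INSERT INTO pair_feature VALUES (?, ?, ?, ?)",
            [(site_id, *f) for f in features(atoms)],
        )
    return con


def rank_sites(con, query_atoms):
    """Return (site_id, votes) rows, where votes counts shared distinct features."""
    con.execute("DROP TABLE IF EXISTS temp.query_feature")
    con.execute(
        "CREATE TEMP TABLE query_feature "
        "(type_a TEXT, type_b TEXT, dist_bin INTEGER)"
    )
    con.executemany(
        "INSERT INTO query_feature VALUES (?, ?, ?)",
        sorted(set(features(query_atoms))),
    )
    return con.execute(
        """
        SELECT site_id, COUNT(*) AS votes FROM (
            SELECT DISTINCT p.site_id, p.type_a, p.type_b, p.dist_bin
            FROM query_feature AS q
            JOIN pair_feature AS p
              ON p.type_a = q.type_a
             AND p.type_b = q.type_b
             AND p.dist_bin = q.dist_bin
        )
        GROUP BY site_id
        ORDER BY votes DESC, site_id
        """
    ).fetchall()


# Toy data (invented coordinates, for illustration only).
sites = {
    "siteA": [("N", (0.0, 0.0, 0.0)), ("O", (1.5, 0.0, 0.0)), ("C", (0.0, 2.5, 0.0))],
    "siteB": [("N", (0.0, 0.0, 0.0)), ("O", (5.2, 0.0, 0.0)), ("S", (0.0, 3.1, 0.0))],
}
# A translated copy of siteA: pair distances are translation-invariant.
query = [("N", (10.0, 10.0, 10.0)), ("O", (11.5, 10.0, 10.0)), ("C", (10.0, 12.5, 10.0))]

con = build_db(sites)
ranking = rank_sites(con, query)
```

In this toy setting only `siteA` shares features with the query. Such a vote count would serve only as a coarse filter; the high-scoring candidates would then be passed to the atomic-resolution alignment and significance estimation described above.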
