Structural Classification of Phosphate Binding Sites in Protein-Nucleotide Complexes: An Automated All-Against-All Structural Comparison Using Geometric Matching

A method is described for the rapid comparison of protein binding sites using geometric matching to detect similar three‐dimensional structure. The geometric matching detects common atomic features by identifying the maximum common sub‐graph, or clique. These features are not necessarily evident from sequence or from global structural similarity, and so give insight into molecular recognition that current sequence and structural classification schemes do not provide. Here we use the method to perform an all‐against‐all comparison of phosphate binding sites in a number of different nucleotide phosphate‐binding proteins. The similarity search is combined with clustering of similar sites to allow a preliminary structural classification. Clustering the 476 representative local environments by site similarity produces ten main clusters, which together account for half of the representative environments. The similarities make sense in terms of both structural and functional classification schemes. The ten main clusters represent a very limited number of unique structural binding motifs for phosphate: the structural P‐loop, the di‐nucleotide binding motif [FAD/NAD(P)‐binding and Rossmann‐like fold], and the FAD‐binding motif. Similar classification schemes for nucleotide binding proteins have also been arrived at independently by others using different methods. Proteins 2004. © 2004 Wiley‐Liss, Inc.
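
The clique-based geometric matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each binding site is a list of typed atoms with coordinates, that two atom pairings are compatible when their intra-site distances agree within a tolerance, and that the largest set of mutually compatible pairings is found with the Bron-Kerbosch algorithm with pivoting. The function names, the atom-type labels, and the 0.5 Å tolerance are hypothetical choices made for illustration only.

```python
from itertools import combinations
from math import dist


def correspondence_graph(site_a, site_b, tol=0.5):
    """Build a correspondence graph for two sites.

    Each site is a list of (atom_type, (x, y, z)) tuples. A node (i, j)
    pairs atom i of site_a with atom j of site_b when their types agree.
    Two nodes are joined when the distance between the two atoms in
    site_a matches the distance between their partners in site_b to
    within `tol` (an assumed tolerance, in the coordinate units).
    """
    nodes = [(i, j)
             for i, (type_a, _) in enumerate(site_a)
             for j, (type_b, _) in enumerate(site_b)
             if type_a == type_b]
    adj = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # an atom may be matched at most once
        d_a = dist(site_a[i][1], site_a[k][1])
        d_b = dist(site_b[j][1], site_b[l][1])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_clique(adj):
    """Return one maximum clique, found by Bron-Kerbosch with pivoting."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest so far
            if len(r) > len(best):
                best = list(r)
            return
        # Pivot on the vertex with the most neighbours among the candidates
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)


# Toy example: site_b is site_a translated and reordered, plus one extra atom.
site_a = [("N", (0.0, 0.0, 0.0)), ("O", (3.0, 0.0, 0.0)), ("N", (0.0, 4.0, 0.0))]
site_b = [("O", (13.0, 10.0, 10.0)), ("N", (10.0, 10.0, 10.0)),
          ("N", (10.0, 14.0, 10.0)), ("C", (20.0, 20.0, 20.0))]

match = max_clique(correspondence_graph(site_a, site_b))
print(match)
```

Each pair in `match` gives an atom index in `site_a` and its partner in `site_b`. Under these assumptions, the size of the clique could serve as a simple similarity score for an all-against-all comparison, although the abstract does not state which score or clustering method is used.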
