Similarity networks of protein binding sites

Increasing attention has been devoted to the characterization of complex networks within the protein world. This work reports networked structures that reflect the structural similarities among protein binding sites. First, a dataset of 211 binding sites was compiled by removing redundant proteins from the Protein Ligand Database (PLD) (http://www‐mitchell.ch.cam.ac.uk/pld/). Using a clique detection algorithm, we performed all‐against‐all comparisons among the 211 binding sites. Each binding site was represented as a node, and an edge was added between two nodes whenever the corresponding pair of binding sites had a similarity higher than a threshold value. The resulting similarity networks revealed that many nodes had few links and only a few were highly connected, but because of the limited data available it was not possible to prove a scale‐free architecture definitively. Within the same dataset, the binding site similarity networks were compared with networks of sequence similarity and fold similarity. In the protein world, indications were found that structure is better conserved than sequence, and that sequence as a whole is in turn better conserved than the subset of functional residues forming the binding site. Because a binding site is strongly linked with protein function, the identification of protein binding site similarity networks could accelerate the functional annotation of newly identified genes. In view of this, we discuss several potential applications of binding site similarity networks, such as the construction of novel binding site classification databases, as well as the implications for protein molecular design in general and computational chemogenomics in particular. Proteins 2006. © 2005 Wiley‐Liss, Inc.
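
The network construction described above (one node per binding site, one edge per pair whose similarity exceeds a threshold) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy scores, and the threshold of 0.5 are all assumptions made for illustration, and the pairwise similarity scores are taken as given rather than computed by clique detection.

```python
from collections import Counter
from itertools import combinations


def build_similarity_network(scores, threshold):
    """Return an adjacency dict linking sites whose similarity exceeds threshold.

    scores: dict mapping frozenset({site_a, site_b}) -> similarity score.
    Pairs missing from `scores` are treated as having similarity 0.0.
    """
    sites = sorted({site for pair in scores for site in pair})
    adjacency = {site: set() for site in sites}
    for a, b in combinations(sites, 2):
        if scores.get(frozenset((a, b)), 0.0) > threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_distribution(adjacency):
    """Count how many nodes have each degree (number of links)."""
    return Counter(len(neighbours) for neighbours in adjacency.values())


# Toy, made-up scores for five hypothetical binding sites (not data from the paper).
toy_scores = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.8,
    frozenset(("A", "D")): 0.7,
    frozenset(("B", "C")): 0.3,
    frozenset(("D", "E")): 0.2,
}

# The threshold value is an arbitrary assumption for this example.
network = build_similarity_network(toy_scores, threshold=0.5)
distribution = degree_distribution(network)
print(distribution)
```

In this toy case one node is linked to three others while the remaining nodes have at most one link, which mirrors on a very small scale the "few highly connected nodes" pattern reported for the real networks. A degree distribution of this kind is the quantity one would inspect when assessing whether a network is compatible with a scale‐free architecture.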
