Ligand-binding site prediction of proteins based on known fragment–fragment interactions

Motivation: The identification of putative ligand-binding sites on proteins is important for the prediction of protein function. Knowledge-based approaches using structure databases have become attractive because of the recent increase in structural information. Approaches using binding motif information are particularly effective, but they can only be applied to well-known ligands that appear frequently in the structure databases.

Results: We have developed a new method for predicting the binding sites of chemically diverse ligands by using information about the interactions between fragments. The selection of the fragment size is important. If the fragments are too small, the patterns derived from the binding motifs cannot be used, since they are many-body interactions, whereas larger fragments limit the application to well-known ligands. In our method, the fragments are the main and side chains for proteins and three successive atoms for ligands. After superposition of the fragments, our method builds the conformations of ligands and predicts the binding sites. As a result, our method could accurately predict the binding sites of chemically diverse ligands, even though nucleotides account for a large share of the ligands currently in the Protein Data Bank. Moreover, a further evaluation on the unbound forms of proteins revealed that our build-up procedure was robust to conformational changes induced by ligand binding.

Availability: Our method, named ‘BUMBLE’, is available at http://bumble.hgc.jp/

Contact: kasahara@cb.k.u-tokyo.ac.jp

Supplementary information: Supplementary Material is available at Bioinformatics online.
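As an illustration of the ligand fragment definition in the Results (three successive atoms), the following minimal sketch enumerates every path of three bonded atoms in a small bond graph. It is not the BUMBLE implementation: the function name `three_atom_fragments`, the atom labels, and the toy ligand are hypothetical, and the sketch assumes that "successive" means connected by covalent bonds.

```python
from collections import defaultdict


def three_atom_fragments(bonds):
    """Return every path of three successively bonded atoms (a-b-c).

    `bonds` is an iterable of (atom, atom) pairs. The atom labels are
    hypothetical; any hashable, sortable identifiers work. Each fragment
    is reported once, as (end atom, centre atom, end atom), with the two
    end atoms in sorted order.
    """
    neighbours = defaultdict(set)
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    fragments = set()
    for centre, adjacent in neighbours.items():
        ordered = sorted(adjacent)
        # Every unordered pair of neighbours of `centre` gives one fragment.
        for i, first in enumerate(ordered):
            for last in ordered[i + 1:]:
                fragments.add((first, centre, last))
    return sorted(fragments)


# Toy ligand (illustrative only): a four-atom chain with one branch at C2.
toy_bonds = [("C1", "C2"), ("C2", "C3"), ("C3", "O4"), ("C2", "N5")]
fragments = three_atom_fragments(toy_bonds)
```

Because a fragment is defined by its centre atom and two of that atom's neighbours, the number of fragments depends only on local connectivity. This is consistent with the abstract's point that small ligand fragments can be found in chemically diverse ligands.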
