Exploring protein-ligand recognition with Binding MOAD.

We recently announced Binding MOAD (Mother of All Databases), the largest database of protein-ligand complexes. As of the August 2004 update, Binding MOAD contains 6816 complexes, comprising 2220 protein families and 3316 unique ligands. After searching more than 6000 crystallography papers, we obtained binding data for 1793 (27%) of the complexes. We also created a non-redundant set containing only one complex from each protein family; in that set, 630 (28%) of the unique complexes have binding data. Here, we describe the data provided at the Binding MOAD website and present the results of mining Binding MOAD to map the degree of solvent exposure of binding sites. We find that most cavities and ligands (70-85%) are well buried in the complexes, which fits the common paradigm that a large degree of contact between ligand and protein is significant in molecular recognition. We created two tools for this study, GoCAV and the GoCAV viewer. To share our data and make our online dataset more useful to other research groups, we have integrated the viewer into the Binding MOAD website (www.BindingMOAD.org).
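As a rough illustration of what "one complex from each protein family" and "fraction with binding data" mean in practice, the following Python sketch reduces a toy list of complexes to one representative per family and reports the binding-data coverage. The record fields (`id`, `family`, `has_binding_data`), the function names, and the rule for picking a representative are all assumptions made for this example; they are not the Binding MOAD schema or the selection criteria actually used to build the database.

```python
def non_redundant(complexes):
    """Keep one complex per protein family.

    Assumed rule (hypothetical, not taken from Binding MOAD): prefer an
    entry that has binding data; otherwise keep the first entry seen.
    """
    chosen = {}
    for c in complexes:
        current = chosen.get(c["family"])
        if current is None:
            chosen[c["family"]] = c
        elif c["has_binding_data"] and not current["has_binding_data"]:
            chosen[c["family"]] = c
    return list(chosen.values())


def coverage(complexes):
    """Fraction of complexes that have binding data (0.0 for an empty list)."""
    if not complexes:
        return 0.0
    return sum(c["has_binding_data"] for c in complexes) / len(complexes)


# Toy records for illustration only; these are not real database entries.
toy = [
    {"id": "A1", "family": "A", "has_binding_data": False},
    {"id": "A2", "family": "A", "has_binding_data": True},
    {"id": "B1", "family": "B", "has_binding_data": False},
    {"id": "C1", "family": "C", "has_binding_data": True},
    {"id": "C2", "family": "C", "has_binding_data": True},
]

reps = non_redundant(toy)
print([c["id"] for c in reps])
print(f"full set: {coverage(toy):.0%}, non-redundant set: {coverage(reps):.0%}")
```

Because the representative is chosen per family, the coverage of the non-redundant set can differ from that of the full set, which is why the two percentages are reported separately.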
