LigASite—a database of biologically relevant binding sites in proteins with known apo-structures

Better characterization of binding sites in proteins and the ability to accurately predict their location and energetic properties are major challenges which, if addressed, would have many valuable practical applications. Unfortunately, reliable benchmark datasets of binding sites in proteins are still sorely lacking. Here, we present LigASite (‘LIGand Attachment SITE’), a gold-standard dataset of binding sites in 550 proteins of known structure. LigASite consists exclusively of biologically relevant binding sites in proteins for which at least one apo- and one holo-structure are available. In defining the binding sites for each protein, information from all holo-structures is combined, considering in each case the quaternary structure defined by the PQS server. LigASite is built using simple criteria and is automatically updated as new structures become available in the PDB, thereby guaranteeing optimal data coverage over time. Both a redundant and a culled non-redundant version of the dataset are available at http://www.scmbb.ulb.ac.be/Users/benoit/LigASite. The website interface allows users to search the dataset by PDB identifier, ligand identifier, protein name or sequence, and to look for structural matches as defined by the CATH homologous superfamilies. The datasets can be downloaded from the website as Schema-validated XML files or comma-separated flat files.
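
As an illustration of how binding-site information from several holo-structures might be combined for one protein, the following minimal Python sketch takes the union of ligand-contacting residues across holo-structures. It is not the LigASite implementation: the function name `merge_binding_sites`, the input layout, and the assumption that residue numbers are already mapped onto the apo-structure numbering are all hypothetical.

```python
from typing import Dict, Set


def merge_binding_sites(holo_sites: Dict[str, Set[int]]) -> Set[int]:
    """Return the union of binding-site residues over all holo-structures.

    holo_sites maps a holo-structure label (hypothetical, e.g. a PDB
    identifier) to the residue numbers observed in contact with a ligand.
    Assumption: residue numbers are already expressed in the numbering of
    the apo-structure, so they can be compared directly.
    """
    merged: Set[int] = set()
    for residues in holo_sites.values():
        merged |= residues  # accumulate residues seen in any holo-structure
    return merged


# Made-up example data for two holo-structures of the same protein.
example = {
    "holo_A": {10, 11, 45},
    "holo_B": {11, 45, 46, 90},
}
combined = sorted(merge_binding_sites(example))
print(combined)
```

A union is the simplest possible combination rule; the criteria actually used to decide which contacts count as biologically relevant are not reproduced in this sketch.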
