Data Deposition and Annotation at the Worldwide Protein Data Bank

The Protein Data Bank (PDB) is the repository for three-dimensional structures of biological macromolecules determined by experimental methods. The data in the archive are freely and easily available via the Internet from any of the worldwide centers that manage this global archive. These data are used by scientists, researchers, bioinformatics specialists, educators, students, and general audiences to understand biological phenomena at a molecular level. Analysis of these structural data also inspires and facilitates new discoveries in science. This chapter describes the tools and methods currently used for deposition, processing, and release of data in the PDB. References to future enhancements are also included.
