Validation of Structures in the Protein Data Bank

Summary

The Worldwide PDB (wwPDB) recently launched a deposition, biocuration, and validation tool: OneDep. At various stages of OneDep data processing, validation reports for three-dimensional structures of biological macromolecules are produced. These reports are based on recommendations of expert task forces representing the crystallography, nuclear magnetic resonance, and cryoelectron microscopy communities. The reports provide metrics with which depositors can evaluate the quality of the experimental data, the structural model, and the fit between them. The validation module is also available as a stand-alone web server and as a programmatically accessible web service. A growing number of journals require the official wwPDB validation reports (produced at biocuration) to accompany manuscripts describing macromolecular structures. Upon public release of the structure, the validation report becomes part of the public PDB archive. Geometric quality scores for proteins in the PDB archive have improved over the past decade.
