Chemical annotation of small and peptide-like molecules at the Protein Data Bank

Over the past decade, the number of polymers and their complexes with small molecules in the Protein Data Bank archive (PDB) has continued to increase significantly. To support scientific advancements and ensure the best quality and completeness of the data files over the next 10 years and beyond, the Worldwide PDB partnership that manages the PDB archive is developing a new deposition and annotation system. This system focuses on efficient data capture across all supported experimental methods. The new deposition and annotation system is composed of four major modules that together support all of the processing requirements for a PDB entry. In this article, we describe one such module called the Chemical Component Annotation Tool. This tool uses information from both the Chemical Component Dictionary and Biologically Interesting molecule Reference Dictionary to aid in annotation. Benchmark studies have shown that the Chemical Component Annotation Tool provides significant improvements in processing efficiency and data quality.

Database URL: http://wwpdb.org
