Crystallography and Databases

Crystallographic databases have existed as electronic resources for over 50 years and provide comprehensive archives of the crystal structures of inorganic, organic, metal–organic and biological macromolecular compounds. These archives are of immense value to a wide range of structural sciences. The databases serve a variety of scientific disciplines, but all are driven by considerations of accuracy, precise characterization, and potential for search, analysis and reuse. They also serve a variety of end-users in academia and industry, and have evolved through different funding and licensing models. The diversity of their operational mechanisms, combined with their undisputed value as scientific research tools, gives rise to a rich ecosystem. A session at SciDataCon2016 gave an overview of the largest extant crystallographic databases, their current activities and their plans for the future. This review summarizes those presentations and considers them alongside other players in the field, demonstrating their variety, versatility and focus on quality and usefulness.
