A Community Contribution Framework for Sharing Materials Data with Materials Project

As scientific discovery becomes increasingly data-driven, software platforms are needed to efficiently organize and disseminate data from disparate sources. This is certainly the case in materials science. For example, Materials Project has generated computational data on over 60,000 chemical compounds and has made that data available through a Web portal and a REST interface. To expand the scope of scientific data sharing, however, such portals must also incorporate community submissions. In this paper, we describe MPContribs, a computing and software infrastructure that integrates and organizes user contributions of simulated or measured materials data. Our solution supports complex submissions and provides interfaces that allow contributors to share analyses and graphs. A RESTful API exposes mechanisms for bookkeeping, retrieval, and aggregation of submitted entries, as well as persistent URIs or DOIs that can be used to reference the data in publications. Our approach isolates contributed data from a host project's quality-controlled core data, yet still enables analyses across the entire dataset, either programmatically or through customized web apps. We expect the framework to enhance the collaborative determination of material properties and to maximize the impact of each contributor's dataset. In the long term, MPContribs seeks to make Materials Project an institutional, and thus community-wide, memory for computational and experimental materials science.
