Machine-learned and codified synthesis parameters of oxide materials

Predictive materials design has accelerated rapidly in recent years with the advent of large-scale resources, such as materials structure and property databases generated by ab initio computations. In the absence of analogous ab initio frameworks for materials synthesis, high-throughput and machine learning techniques have recently been harnessed to generate synthesis strategies for select materials of interest. Still, a community-accessible, autonomously compiled synthesis planning resource that spans materials systems has not yet been developed. In this work, we present a collection of aggregated synthesis parameters computed from the text of over 640,000 journal articles using state-of-the-art natural language processing and machine learning techniques. We provide a dataset of synthesis parameters, compiled autonomously across 30 different oxide systems, in a format optimized for planning novel syntheses of materials.
