Inverse Design of Solid-State Materials via a Continuous Representation

Summary: The non-serendipitous discovery of materials with targeted properties is the ultimate goal of materials research, but materials design to date does not incorporate all available knowledge when planning the synthesis of the next material. This work presents a framework for learning a continuous representation of materials and for building a generative model that discovers new materials from that latent-space representation. The ability of autoencoders to generate experimentally realizable materials is demonstrated with vanadium oxides: the model rediscovers experimentally known structures that were withheld from training. Approximately 20,000 hypothetical materials are generated, leading to several completely new metastable VxOy materials that may be synthesizable. A comparison with genetic algorithms suggests that generative models are computationally efficient for crystal structure prediction, because they explore chemical compositional space effectively by learning the distributions of known materials. These results are an important step toward machine-learned inverse design of inorganic functional materials using generative models.
