Harnessing the Materials Project for machine-learning and accelerated discovery

Author(s): Ye, W; Chen, C; Dwaraknath, S; Jain, A; Ong, SP; Persson, KA

Abstract: © Copyright Materials Research Society 2018. Improvements in computational resources over the last decade are enabling a new era of computational prediction and design of novel materials. Among the resulting resources are databases such as the Materials Project (www.materialsproject.org), which harnesses the power of supercomputing together with state-of-the-art quantum mechanical theory to compute the properties of all known inorganic materials, to design novel materials, and to make the data available for free to the community, together with online analysis and design algorithms. The current release contains data derived from quantum mechanical calculations for more than 70,000 materials and millions of associated materials properties. The software infrastructure carries out thousands of calculations per week, enabling screening and predictions for both novel solids and molecular species with targeted properties. As the rapid growth of accessible computed materials properties continues, the next frontier is harnessing that information for automated learning and accelerated discovery. In this article, we highlight some of the emerging and exciting efforts and successes, as well as current challenges, in using descriptor-based and machine-learning methods for data-accelerated materials design.
