Predicting the thermodynamic stability of perovskite oxides using machine learning models

Abstract: Perovskite materials have become ubiquitous in many technologically relevant applications, ranging from catalysts in solid oxide fuel cells to light-absorbing layers in solar photovoltaics. Thermodynamic phase stability is a key parameter that broadly governs whether a material is expected to be synthesizable and whether it may degrade under certain operating conditions. Phase stability can be calculated using Density Functional Theory (DFT), but the significant computational cost makes such calculations potentially prohibitive when screening large numbers of candidate compounds. In this work, we developed machine learning models to predict the thermodynamic phase stability of perovskite oxides using a dataset of more than 1900 DFT-calculated perovskite oxide energies. The phase stability was determined using convex hull analysis, with the energy above the convex hull (Ehull) providing a direct measure of the stability. We generated a set of 791 features based on elemental property data to correlate with the Ehull value of each perovskite compound, and found through feature selection that the top 70 features were sufficient to produce the most accurate models without significant overfitting. For classification, the extra trees algorithm achieved the best prediction accuracy of 0.93 (±0.02), with an F1 score of 0.88 (±0.03). For regression, leave-out 20% cross-validation tests with kernel ridge regression achieved the lowest root mean square error (RMSE) of 28.5 (±7.5) meV/atom between cross-validation predicted Ehull values and DFT calculations, with a mean absolute error (MAE) in cross-validation energies of 16.7 (±2.3) meV/atom. This error is within the range of errors in DFT formation energies relative to elemental reference states when compared to experiments, and therefore may be considered sufficiently accurate to use in place of full DFT calculations. We further validated our model by predicting the stability of compounds not present in the training set and demonstrated that our machine learning models are a fast and effective means of obtaining qualitatively useful guidance on the stability of a wide range of perovskite oxides, potentially impacting materials design choices in a variety of technological applications.
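The energy above the convex hull used as the stability measure can be illustrated with a minimal sketch. This is not the authors' workflow: it assumes a simplified two-component system in which each entry is a (composition fraction, formation energy) pair, and the energies are made-up values chosen only for illustration. The function names (`lower_hull`, `hull_energy`, `energy_above_hull`) are hypothetical. Perovskite oxides require a multi-component hull, but the idea is the same: Ehull is the vertical distance from a compound's energy to the lowest-energy combination of competing phases at the same composition.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (composition fraction x, formation energy in eV/atom)


def lower_hull(points: List[Point]) -> List[Point]:
    """Lower convex hull of (x, E) points using a monotone-chain sweep."""
    # Keep only the lowest energy at each composition; higher ones cannot be on the hull.
    lowest: Dict[float, float] = {}
    for x, e in points:
        if x not in lowest or e < lowest[x]:
            lowest[x] = e
    hull: List[Point] = []
    for p in sorted(lowest.items()):
        # Drop the last vertex while it lies on or above the segment hull[-2] -> p.
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_energy(x: float, hull: List[Point]) -> float:
    """Linearly interpolate the hull energy at composition x."""
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            t = (x - x1) / (x2 - x1)
            return e1 + t * (e2 - e1)
    raise ValueError("composition outside the range spanned by the hull")


def energy_above_hull(points: List[Point]) -> List[float]:
    """Ehull for every entry: its energy minus the hull energy at the same x."""
    hull = lower_hull(points)
    return [e - hull_energy(x, hull) for x, e in points]


# Hypothetical entries (made-up energies, eV/atom); the end members define E = 0.
entries = [
    (0.00, 0.00),
    (0.25, -0.10),
    (0.50, -0.40),
    (0.50, -0.25),  # a higher-energy polymorph at the same composition
    (0.75, -0.15),
    (1.00, 0.00),
]
e_hull = energy_above_hull(entries)
for (x, e), eh in zip(entries, e_hull):
    print(f"x={x:.2f}  E={e:+.3f}  Ehull={eh * 1000:.0f} meV/atom")
```

In this toy example only the end members and the lowest-energy entry at x = 0.50 lie on the hull (Ehull = 0); every other entry has a positive Ehull and is predicted to decompose into neighboring hull phases.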
