Identifying Pb-free perovskites for solar cells by machine learning

Recent advances in computing power have made it possible to generate large materials datasets, which in turn support data-driven approaches to problems in materials science, including materials discovery. Machine learning is a primary tool for analyzing such datasets, predicting unknown material properties, and uncovering structure-property relationships. Among state-of-the-art machine learning algorithms, gradient-boosted regression trees (GBRT) are known to provide highly accurate predictions together with interpretable analysis based on feature importance. Here, in a search for lead-free perovskites for use in solar cells, we applied the GBRT algorithm to a dataset of electronic structures of candidate halide double perovskites to predict their heat of formation and bandgap. Statistical analysis of the selected features yields design guidelines for the discovery of new lead-free perovskites.
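
To make the GBRT idea concrete, the following is a minimal sketch and not the pipeline used in the study. It assumes squared-error loss, depth-one trees (stumps), and a gain-based feature importance. The three input columns are hypothetical stand-ins for material descriptors, and the target is synthetic toy data rather than a computed heat of formation or bandgap. All function names are illustrative.

```python
import random

def fit_stump(X, residuals):
    """Return the best single split as (gain, feature, threshold, left_mean, right_mean)."""
    n, d = len(X), len(X[0])
    total = sum(residuals)
    best = None
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Reduction in squared error achieved by fitting one mean per side.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, j, lo, left_sum / k, right_sum / (n - k))
    return best

def fit_gbrt(X, y, n_trees=200, learning_rate=0.1):
    """Boost stumps on residuals (the negative gradient of squared-error loss)."""
    f0 = sum(y) / len(y)
    pred = [f0] * len(y)
    stumps = []
    importance = [0.0] * len(X[0])
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break  # no valid split left
        gain, j, thr, left, right = stump
        stumps.append((j, thr, left, right))
        importance[j] += gain  # credit the split feature with its error reduction
        pred = [p + learning_rate * (left if x[j] <= thr else right)
                for p, x in zip(pred, X)]
    total = sum(importance) or 1.0
    return (f0, learning_rate, stumps), [g / total for g in importance]

def predict(model, x):
    f0, lr, stumps = model
    return f0 + lr * sum(left if x[j] <= thr else right
                         for j, thr, left, right in stumps)

# Synthetic toy data (an assumption for illustration only): the target depends
# strongly on column 0, weakly on column 1, and not at all on column 2.
random.seed(0)
X = [[random.random() for _ in range(3)] for _ in range(200)]
y = [2.0 * a + 0.5 * b * b + random.gauss(0, 0.05) for a, b, _ in X]

model, importances = fit_gbrt(X, y)
mean_y = sum(y) / len(y)
baseline_mse = sum((t - mean_y) ** 2 for t in y) / len(y)
mse = sum((predict(model, x) - t) ** 2 for x, t in zip(X, y)) / len(y)
print("normalized feature importances:", importances)
print("training MSE:", mse, "vs. predicting the mean:", baseline_mse)
```

Ranking the normalized importances is the kind of analysis that can point to which descriptors matter most for a predicted property. In practice, a library implementation with deeper trees and held-out validation would replace this hand-written version.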
