Data mining-aided materials discovery and optimization

Abstract: This paper reviews recent developments in data mining-aided materials discovery and optimization and introduces the materials data mining (MDM) process through case studies. Both qualitative and quantitative machine learning methods can be used in the MDM process to accomplish different tasks in materials discovery, design, and optimization. State-of-the-art techniques are demonstrated through four case studies: the controllable synthesis of dendritic Co3O4 superstructures, the materials design of layered double hydroxides, battery materials discovery, and thermoelectric materials design. The case studies indicate that MDM is a powerful approach to materials discovery and innovation, and that it will play an important role in the development of the Materials Genome Initiative and materials informatics.

[1]  David J. Singh,et al.  BoltzTraP. A code for calculating band-structure dependent quantities , 2006, Comput. Phys. Commun..

[2]  Gerbrand Ceder,et al.  Opportunities and challenges for first-principles materials design and applications to Li battery materials , 2010 .

[3]  Anil K. Jain,et al.  Statistical Pattern Recognition: A Review , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[4]  Siqi Shi,et al.  Multi-scale computation methods: Their applications in lithium-ion battery research and development , 2016 .

[5]  Gerbrand Ceder,et al.  Predicting crystal structure by merging data mining with quantum mechanics , 2006, Nature materials.

[6]  Ruijuan Xiao,et al.  Candidate structures for inorganic lithium solid-state electrolytes identified by high-throughput bond-valence calculations , 2015 .

[7]  Lotfi A. Zadeh,et al.  Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic , 1997, Fuzzy Sets Syst..

[8]  Jihui Yang,et al.  Evaluation of Half‐Heusler Compounds as Thermoelectric Materials Based on the Calculated Electrical Transport Properties , 2008 .

[9]  Jungho Im,et al.  Support vector machines in remote sensing: A review , 2011 .

[10]  Anubhav Jain,et al.  Effective mass and Fermi surface complexity factor from ab initio band structure calculations , 2017, npj Computational Materials.

[11]  Stefan Van Aelst,et al.  Machine Learning and Robust Data Mining , 2007, Comput. Stat. Data Anal..

[12]  Liang Liu,et al.  Using support vector machine for materials design , 2013 .

[13]  Paul Raccuglia,et al.  Machine-learning-assisted materials discovery using failed experiments , 2016, Nature.

[14]  Anubhav Jain,et al.  Novel mixed polyanions lithium-ion battery cathode materials predicted by high-throughput ab initio computations , 2011 .

[15]  Anubhav Jain,et al.  YCuTe2: a member of a new class of thermoelectric materials with CuTe4-based layered structure , 2016 .

[16]  G. R. Rao,et al.  Effect of Microwave on the Nanowire Morphology, Optical, Magnetic, and Pseudocapacitance Behavior of Co3O4 , 2011 .

[17]  Liquan Chen,et al.  Physics towards next generation Li secondary batteries materials: A short review from computational materials design perspective , 2013 .

[18]  Tiejun Zhu,et al.  Band engineering of high performance p-type FeNbSb based half-Heusler thermoelectric materials for figure of merit zT > 1 , 2015 .

[19]  Ruijuan Xiao,et al.  Quantitative structure-property relationship study of cathode volume changes in lithium ion batteries using ab-initio and partial least squares analysis , 2017 .

[20]  L. Zhang,et al.  Shape-Controlled Synthesis and Pattern Recognition of Dendritic Co3O4 Superstructures , 2013 .

[21]  Zonghai Chen,et al.  A generalized method for high throughput in-situ experiment data analysis: An example of battery materials exploration , 2015 .

[22]  David J. Singh,et al.  On the tuning of electrical and thermal transport in thermoelectrics: an integrated theory–experiment perspective , 2016 .

[23]  C. Zhi,et al.  Cobalt(II,III) oxide hollow structures: fabrication, properties and applications , 2012 .

[24]  Marco Buongiorno Nardelli,et al.  High-throughput computational screening of thermal conductivity, Debye temperature, and Grüneisen parameter using a quasiharmonic Debye model , 2014, 1407.7789.

[25]  Raynald Gauvin,et al.  Application of machine learning methods for the prediction of crystal system of cathode materials in lithium-ion batteries , 2016 .

[26]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[27]  Qin Pei,et al.  Chemometric methods applied to industrial optimization and materials optimal design , 1999 .

[28]  K. Fujimura,et al.  Accelerated Materials Design of Lithium Superionic Conductors Based on First‐Principles Calculations and Machine Learning Algorithms , 2013 .

[29]  S. Suib,et al.  Removal of Azo Dyes: Intercalation into Sonochemically Synthesized NiAl Layered Double Hydroxide , 2014 .

[30]  Vladan Stevanović,et al.  Material descriptors for predicting thermoelectric performance , 2015 .

[31]  David E. Goldberg,et al.  Genetic Algorithms in Search Optimization and Machine Learning , 1988 .

[32]  Michael E. Tipping Sparse Bayesian Learning and the Relevance Vector Machine , 2001, J. Mach. Learn. Res..

[33]  Christopher M Wolverton,et al.  High‐Throughput Computational Screening of New Li‐Ion Battery Anode Materials , 2013 .

[34]  Jihui Yang,et al.  High‐Performance Pseudocubic Thermoelectric Materials from Non‐cubic Chalcopyrite Compounds , 2014, Advanced materials.

[35]  Anubhav Jain,et al.  Phosphates as Lithium-Ion Battery Cathodes: An Evaluation Based on High-Throughput ab Initio Calculations , 2011 .

[36]  Timothy L. Andersen,et al.  GAMPMS: Genetic algorithm managed peptide mutant screening , 2015, J. Comput. Chem..

[37]  Mayumi Kimura,et al.  Informatics-Aided Density Functional Theory Study on the Li Ion Transport of Tavorite-Type LiMTO4F (M3+-T5+, M2+-T6+) , 2015, J. Chem. Inf. Model..

[38]  Anuj Kumar Goyal,et al.  Capturing Anharmonicity in a Lattice Thermal Conductivity Model for High-Throughput Predictions , 2017 .

[39]  Stefano Curtarolo,et al.  Finding Unprecedentedly Low-Thermal-Conductivity Half-Heusler Semiconductors via High-Throughput Materials Modeling , 2014, 1401.2439.

[40]  S. Adams,et al.  Pathway models for fast ion conductors by combination of bond valence and reverse Monte Carlo methods , 2002 .

[41]  Anubhav Jain,et al.  Computational and experimental investigation of TmAgTe2 and XYZ2 compounds, a new group of thermoelectric materials identified by first-principles high-throughput screening , 2015 .

[42]  Wencong Lu,et al.  Using support vector regression for the prediction of the band gap and melting point of binary and ternary compound semiconductors , 2006 .

[43]  G. Madsen,et al.  Automated search for new thermoelectric materials: the case of LiZnSb. , 2006, Journal of the American Chemical Society.

[44]  Liquan Chen,et al.  Oxygen-driven transition from two-dimensional to three-dimensional transport behaviour in β-Li3PS4 electrolyte. , 2016, Physical chemistry chemical physics : PCCP.

[45]  James Theiler,et al.  Accelerated search for materials with targeted properties by adaptive design , 2016, Nature Communications.

[46]  Theofanis Sapatinas,et al.  Discriminant Analysis and Statistical Pattern Recognition , 2005 .

[47]  Yu Ren,et al.  Ordered mesoporous metal oxides: synthesis and applications. , 2012, Chemical Society reviews.

[48]  Stefano Curtarolo,et al.  Assessing the Thermoelectric Properties of Sintered Compounds via High-Throughput Ab-Initio Calculations , 2011 .

[49]  Jihui Yang,et al.  Electrical Transport Properties of Filled CoSb3 Skutterudites: A Theoretical Study , 2009 .

[50]  Yukinori Koyama,et al.  Accelerated discovery of cathode materials with prolonged cycle life for lithium-ion battery , 2014, Nature Communications.

[51]  Francesco Ciucci,et al.  Data mining of molecular dynamics data reveals Li diffusion characteristics in garnet Li7La3Zr2O12 , 2017, Scientific Reports.

[52]  Hong Li,et al.  High-throughput design and optimization of fast lithium ion conductors by the combination of bond-valence method and density functional theory , 2015, Scientific Reports.

[53]  Denis Pasero,et al.  High throughput methodology for synthesis, screening, and optimization of solid state lithium ion electrolytes. , 2011, ACS combinatorial science.

[54]  Wencong Lu,et al.  Prediction and synthesis of novel layered double hydroxide with desired basal spacing based on relevance vector machine , 2017 .

[55]  Mathew D. Halls,et al.  High-throughput quantum chemistry and virtual screening for lithium ion battery electrolyte additives , 2010 .

[56]  Fuhui Long,et al.  Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy , 2003, IEEE Transactions on Pattern Analysis and Machine Intelligence.