Analyzing machine learning models to accelerate generation of fundamental materials insights

Machine learning for materials science aims to accelerate basic research by automatically identifying key data relationships that augment human interpretation and advance scientific understanding. A primary role of scientists is to extract fundamental knowledge from data, and we demonstrate that neural networks can accelerate this extraction through analysis of the trained model itself rather than through its use as a prediction tool. Convolutional neural networks excel at modeling complex data relationships in multi-dimensional parameter spaces, such as those mapped by a combinatorial materials science experiment. Measuring a performance metric across a given materials space provides direct information about (locally) optimal materials but not about the underlying materials science that gives rise to the variation in performance. We build a model that predicts performance (in this case, photoelectrochemical power generation of a solar fuels photoanode) from materials parameters (in this case, composition and Raman signal); subsequent analysis of gradients in the trained model then reveals key data relationships that are not readily identified by human inspection or traditional statistical analyses. Human interpretation of these key relationships produces the desired fundamental understanding, demonstrating a framework in which machine learning accelerates data interpretation by leveraging the expertise of the human scientist. We also demonstrate the use of neural network gradient analysis to automate prediction of directions in parameter space, such as the addition of specific alloying elements, that may increase performance by moving beyond the confines of existing data.
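The idea of analyzing gradients of a trained model, rather than only using it to predict, can be illustrated with a minimal sketch. This is not the convolutional network described above: it assumes a tiny one-hidden-layer model with hypothetical, hand-picked weights standing in for a trained model, and three hypothetical input features (for example, fractions of alloying elements). The names `predict`, `input_gradient`, and `finite_difference` are illustrative only.

```python
import math

# Hypothetical fixed weights standing in for a trained model.
# They are illustrative values, not taken from the work described above.
W1 = [[0.8, -0.5, 0.1],
      [-0.3, 0.9, 0.4]]   # hidden-layer weights (2 hidden units x 3 inputs)
B1 = [0.0, 0.1]           # hidden-layer biases
W2 = [1.2, -0.7]          # output weights
B2 = 0.05                 # output bias


def hidden_activations(x):
    """tanh activations of the hidden layer for input vector x."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W1, B1)]


def predict(x):
    """Predicted performance metric for input features x."""
    return sum(w * h for w, h in zip(W2, hidden_activations(x))) + B2


def input_gradient(x):
    """Analytic gradient of predict(x) with respect to each input feature."""
    hidden = hidden_activations(x)
    # Chain rule: d tanh(z)/dz = 1 - tanh(z)^2
    scale = [w * (1.0 - h * h) for w, h in zip(W2, hidden)]
    return [sum(s * row[j] for s, row in zip(scale, W1))
            for j in range(len(x))]


def finite_difference(x, eps=1e-6):
    """Central-difference check of the analytic gradient."""
    grads = []
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += eps
        down[j] -= eps
        grads.append((predict(up) - predict(down)) / (2 * eps))
    return grads


x0 = [0.2, 0.5, 0.3]          # hypothetical sample in parameter space
grad = input_gradient(x0)
# Feature whose increase the model associates with the largest predicted gain.
best = max(range(len(grad)), key=lambda j: grad[j])
```

Here the sign and magnitude of each entry in `grad` indicate how the modeled performance responds locally to each input, and `best` marks the direction the model suggests for improvement. Any such suggestion is a property of the model and would still need human interpretation and experimental validation.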
