AI Applications through the Whole Life Cycle of Material Discovery

Summary: We review machine learning (ML) tools for material discovery and advanced applications of different ML strategies. Although a few published reviews cover artificial intelligence (AI) for materials, they emphasize a single material system or individual methods; this paper instead takes an application-based perspective on AI-enhanced material discovery. It shows how AI strategies are applied across the stages of material discovery (characterization, property prediction, synthesis, and theory paradigm discovery). By referring to the ML tutorial, readers can also gain a better understanding of the exact function of each ML method in each application and of how these methods achieve their targets. Our aim is to enable better integration of AI methods with the material discovery process. We also highlight the keys to successful applications of AI in material discovery and the challenges that remain to be addressed.
