A machine learning framework for the analysis and prediction of catalytic activity from experimental data

Abstract: We present a machine learning framework for exploring the limits of predictability of catalytic activity from experimental descriptor data, which characterize catalyst formulations and reaction conditions. Artificial neural networks fuse the descriptor data to predict activity, and principal component analysis (PCA) and sparse PCA project the experimental data into an information space in which we identify regions of low and high predictability. The framework also incorporates a constrained-PCA optimization formulation that identifies new experimental points while excluding regions of the experimental space ruled out by constraints on technology, economics, and expert knowledge. This allows us to navigate the experimental space in a more targeted manner. We apply the framework to a comprehensive water-gas shift reaction data set containing 2,228 experimental data points collected from the literature. The neural network analysis reveals strong predictability of activity across reaction conditions (e.g., varying temperature) but important gaps in predictability across catalyst formulations (e.g., varying metal, support, and promoter). PCA reveals that these gaps arise because most experiments reported in the literature lie within narrow regions of the information space. We demonstrate that the framework can systematically guide experiments and the selection of descriptors to improve predictability and identify promising new formulations.
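The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for descriptor data, the function names `pca_project` and `occupancy` are hypothetical, and counting points per grid cell is only one assumed way to show that experiments cluster in narrow regions of the information space. Sparse PCA and the constrained-PCA formulation are not covered.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Standardize X and project it onto its leading principal components."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0                  # guard against constant columns
    Z = (X - mu) / sigma
    # Rows of Vt are the principal directions of the standardized data.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    explained = S**2 / np.sum(S**2)          # fraction of variance per component
    return scores, explained[:n_components]

def occupancy(scores, bins=10):
    """Count how many points fall in each cell of a 2D grid over the scores."""
    counts, _, _ = np.histogram2d(scores[:, 0], scores[:, 1], bins=bins)
    return counts

# Synthetic stand-in for descriptor data (assumption: 500 points, 6 descriptors).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
X[:, 1] = 0.9 * X[:, 0] + 0.1 * X[:, 1]      # make two descriptors correlated

scores, explained = pca_project(X)
counts = occupancy(scores)
empty_fraction = float(np.mean(counts == 0))  # share of grid cells with no data

print("explained variance ratio:", explained)
print("fraction of empty cells:", empty_fraction)
```

Cells with few or no points mark regions of the projected space where a predictive model has little data to learn from, which is the kind of gap the abstract attributes to the literature data.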
