The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysis

The computational catalysis and machine learning communities have made considerable progress in developing machine learning models for catalyst discovery and design. Yet a general machine learning potential that spans the chemical space of catalysis is still out of reach. A significant hurdle is obtaining training data across a wide range of materials. One important class of materials for which data is lacking is oxides, and this gap keeps models from being applied to the Oxygen Evolution Reaction (OER) and to oxide electrocatalysis more generally. To address this, we developed the Open Catalyst 2022 (OC22) dataset, consisting of 62,521 Density Functional Theory (DFT) relaxations (∼9,884,504 single-point calculations) across a range of oxide materials, coverages, and adsorbates (*H, *O, *N, *C, *OOH, *OH, *OH₂, *O₂, *CO). We define generalized tasks to predict the total system energy that are applicable across catalysis, develop baseline performance for several graph neural networks (SchNet, DimeNet++, ForceNet, SpinConv, PaiNN, GemNet-dT, GemNet-OC), and provide pre-defined dataset splits to establish clear benchmarks for future efforts. For all tasks, we study whether combining datasets leads to better results, even if they contain different materials or adsorbates. Specifically, we jointly train models on the Open Catalyst 2020 (OC20) dataset and OC22, or fine-tune pretrained OC20 models on OC22. In the most general task, GemNet-OC sees a ∼32% improvement in energy predictions through fine-tuning and a ∼9% improvement in force predictions via joint training. Surprisingly, joint training on both the OC20 and the much smaller OC22 datasets also improves total energy predictions on OC20 by ∼19%. The dataset and baseline models are open sourced, and a public leaderboard will follow to encourage continued community development on the total energy tasks and data.
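The difference between the two dataset-combining strategies named above, joint training and fine-tuning, can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the one-parameter linear model, the synthetic `large_set` and `small_set` (stand-ins for a large and a small dataset with slightly different targets), the helper `sgd`, and all hyperparameters. It is not the training code, the models, or the data used for OC20 or OC22.

```python
# Toy illustration only: a one-parameter model y = w * x fitted by SGD.
# large_set and small_set are hypothetical stand-ins for a large and a
# small dataset whose targets follow slightly different slopes.

def sgd(weight, data, lr, epochs):
    """Run plain SGD on squared error and return the updated weight."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (weight * x - y) * x  # d/dw of (w*x - y)^2
            weight -= lr * grad
    return weight

# Large dataset: 50 points with slope 2.0.
large_set = [(i / 50, 2.0 * i / 50) for i in range(1, 51)]
# Small dataset: 5 points with slope 2.5.
small_set = [(i / 5, 2.5 * i / 5) for i in range(1, 6)]

# Fine-tuning: pretrain on the large set, then continue on the small set
# with a smaller learning rate, starting from the pretrained weight.
pretrained = sgd(0.0, large_set, lr=0.1, epochs=20)
finetuned = sgd(pretrained, small_set, lr=0.05, epochs=20)

# Joint training: train from scratch on the concatenation of both sets.
joint = sgd(0.0, large_set + small_set, lr=0.1, epochs=20)

print(f"pretrained={pretrained:.3f} finetuned={finetuned:.3f} joint={joint:.3f}")
```

In this toy setting the fine-tuned weight moves toward the small set's slope, while the jointly trained weight settles between the two slopes, which is the qualitative trade-off between specializing to one dataset and fitting both at once.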
