Forces are not Enough: Benchmark and Critical Evaluation for Machine Learning Force Fields with Molecular Simulations

Molecular dynamics (MD) simulation techniques are widely used across the natural sciences. Increasingly, machine learning (ML) force field (FF) models are beginning to replace ab-initio simulations by predicting forces directly from atomic structures. Despite significant progress in this area, such techniques are primarily benchmarked by their force/energy prediction errors, even though the practical use case is to produce realistic MD trajectories. We aim to fill this gap by introducing a novel benchmark suite for learned MD simulation. We curate representative MD systems, including water, organic molecules, a peptide, and materials, and design evaluation metrics corresponding to the scientific objectives of the respective systems. We benchmark a collection of state-of-the-art (SOTA) ML FF models and illustrate, in particular, how the commonly benchmarked force accuracy is not well aligned with relevant simulation metrics. We demonstrate when and how selected SOTA methods fail, and offer directions for further improvement. Specifically, we identify stability as a key metric for ML models to improve. Our benchmark suite comes with a comprehensive open-source codebase for training and simulation with ML FFs to facilitate future work.
