Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials

Efficiently creating a concise but comprehensive data set for training machine-learned interatomic potentials (MLIPs) is an under-explored problem. Active learning (AL), which uses either biased or unbiased molecular dynamics (MD) simulations to generate candidate pools, aims to address this objective. Existing biased and unbiased MD simulations, however, are prone to miss either rare events or extrapolative regions -- areas of the configurational space where unreliable predictions are made. Simultaneously exploring both regions is necessary for developing uniformly accurate MLIPs. In this work, we demonstrate that MD simulations, when biased by the MLIP's energy uncertainty, effectively capture extrapolative regions and rare events without the need to know a priori the system's transition temperatures and pressures. Exploiting automatic differentiation, we enhance bias-forces-driven MD simulations by introducing the concept of bias stress. We also employ calibrated ensemble-free uncertainties derived from sketched gradient features to yield MLIPs with similar or better accuracy than ensemble-based uncertainty methods at a lower computational cost. We use the proposed uncertainty-driven AL approach to develop MLIPs for two benchmark systems: alanine dipeptide and MIL-53(Al). Compared to MLIPs trained with conventional MD simulations, MLIPs trained with the proposed data-generation method more accurately represent the relevant configurational space for both atomic systems.
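
The idea of biasing MD by an energy uncertainty, with both bias forces and a bias stress, can be illustrated with a toy sketch. This is not the authors' implementation, and several parts are assumptions made purely for illustration: the variance over three hypothetical Morse-like pair potentials stands in for the calibrated, ensemble-free uncertainty; central finite differences stand in for automatic differentiation; the bias energy is taken as -tau * u(R); and the stress follows the common convention sigma = (1/V) dE/d(strain). The names `uncertainty`, `bias_forces`, `bias_stress`, and `tau` are hypothetical.

```python
import math

# Hypothetical stand-in for an MLIP ensemble: three Morse-like pair
# potentials given as (well depth, equilibrium distance).
ENSEMBLE = [(1.0, 1.00), (1.1, 0.95), (0.9, 1.05)]


def total_energy(positions, depth, r0):
    """Sum of Morse-like pair energies over all atom pairs."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            energy += depth * (1.0 - math.exp(-(r - r0))) ** 2
    return energy


def uncertainty(positions):
    """Toy energy uncertainty: variance of the ensemble's total energies."""
    energies = [total_energy(positions, d, r0) for d, r0 in ENSEMBLE]
    mean = sum(energies) / len(energies)
    return sum((e - mean) ** 2 for e in energies) / len(energies)


def bias_energy(positions, tau):
    """Assumed bias energy -tau * u(R); lowering it raises the uncertainty."""
    return -tau * uncertainty(positions)


def bias_forces(positions, tau, h=1e-5):
    """Bias forces F = -dE_bias/dR by central finite differences."""
    forces = []
    for i in range(len(positions)):
        row = []
        for k in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][k] += h
            minus[i][k] -= h
            diff = bias_energy(plus, tau) - bias_energy(minus, tau)
            row.append(-diff / (2.0 * h))
        forces.append(row)
    return forces


def strained(positions, a, b, eps):
    """Apply a symmetric strain of size eps to component (a, b)."""
    out = []
    for p in positions:
        q = list(p)
        q[a] += 0.5 * eps * p[b]
        q[b] += 0.5 * eps * p[a]
        out.append(q)
    return out


def bias_stress(positions, volume, tau, h=1e-5):
    """Bias stress sigma_ab = (1/V) dE_bias/d(eps_ab) at zero strain."""
    stress = [[0.0] * 3 for _ in range(3)]
    for a in range(3):
        for b in range(3):
            e_plus = bias_energy(strained(positions, a, b, h), tau)
            e_minus = bias_energy(strained(positions, a, b, -h), tau)
            stress[a][b] = (e_plus - e_minus) / (2.0 * h * volume)
    return stress


# Four atoms in a hypothetical box of volume 8.0 (arbitrary units).
POSITIONS = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.2, 1.0, 0.1), (0.4, 0.3, 0.9)]
forces = bias_forces(POSITIONS, tau=1.0)
stress = bias_stress(POSITIONS, volume=8.0, tau=1.0)
print("bias forces:", forces)
print("bias stress:", stress)
```

In a biased simulation of this kind, the bias forces would be added to the potential's own forces at each step, and in a variable-cell (NPT) run the bias stress would be added to the physical stress so that the cell itself is also pushed toward high-uncertainty regions. A small step along the bias forces should increase the toy uncertainty, which is a quick sanity check on the sign convention assumed here.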
