Utilizing Machine Learning for Efficient Parameterization of Coarse Grained Molecular Force Fields

We present a machine learning approach to automated force field development in Dissipative Particle Dynamics (DPD). The approach employs Bayesian optimization to parameterize a DPD force field against experimentally determined partition coefficients. The optimization covers a discrete space of over 40,000,000 points, where each point represents a set of potentials that jointly form a force field. We find that Bayesian optimization reaches a force field of comparable performance to the current state of the art within 40 iterations. The best iteration during the optimization achieves an R² of 0.78 and an RMSE of 0.63 log units on the training set. These metrics are maintained when a validation set is included, giving an R² of 0.8 and an RMSE of 0.65 log units. This work hence provides a proof of concept, demonstrating the utility of coupling automated, efficient global optimization with a top-down, data-driven approach to force field parameterization. Compared to commonly employed alternative methods, Bayesian optimization offers global parameter searching and a low time to solution.
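To make the idea of Bayesian optimization over a discrete parameter space concrete, the following is a minimal sketch, not the implementation used in this work. It assumes a Gaussian process surrogate with an RBF kernel and an expected-improvement acquisition function, neither of which is specified above. The names `bayes_opt_discrete`, `gp_posterior`, and `expected_improvement` are hypothetical, as are the default `length_scale` and `noise` values. The `objective` argument stands in for an expensive evaluation such as running DPD simulations and computing the RMSE against experimental partition coefficients.

```python
import itertools
import math

import numpy as np


def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between two sets of points (rows)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def gp_posterior(x_obs, y_obs, x_query, noise=1e-4):
    """GP posterior mean and standard deviation at x_query.

    Observations are standardized internally; results are returned
    in the original units of y_obs.
    """
    mean, std = y_obs.mean(), y_obs.std()
    std = std if std > 0 else 1.0
    y = (y_obs - mean) / std
    k = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_s = rbf_kernel(x_obs, x_query)
    mu = k_s.T @ np.linalg.solve(k, y)
    var = 1.0 - (k_s * np.linalg.solve(k, k_s)).sum(axis=0)
    sigma = np.sqrt(np.clip(var, 1e-12, None))
    return mean + std * mu, std * sigma


def expected_improvement(mu, sigma, best):
    """Expected improvement for a minimization problem."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf


def bayes_opt_discrete(objective, levels, n_init=5, n_iter=40, seed=0):
    """Minimize `objective` over the Cartesian product of `levels`."""
    rng = np.random.default_rng(seed)
    # Every grid point is one candidate parameter set.
    grid = np.array(list(itertools.product(*levels)), dtype=float)
    lo, hi = grid.min(axis=0), grid.max(axis=0)
    unit = (grid - lo) / (hi - lo)  # rescale each parameter to [0, 1]

    # Random initial design, then one new evaluation per iteration.
    tried = [int(i) for i in rng.choice(len(grid), size=n_init, replace=False)]
    values = [objective(grid[i]) for i in tried]
    for _ in range(n_iter):
        mu, sigma = gp_posterior(unit[tried], np.array(values), unit)
        ei = expected_improvement(mu, sigma, min(values))
        ei[tried] = -np.inf  # never re-evaluate a visited point
        nxt = int(np.argmax(ei))
        tried.append(nxt)
        values.append(objective(grid[nxt]))

    best = int(np.argmin(values))
    return grid[tried[best]], values[best], values
```

In this sketch the full grid is enumerated in memory, which is practical only for small illustrative spaces; a space of tens of millions of points would need candidate subsampling or a similar strategy, which is not shown here.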
