Idea Paper: Development of a Software Framework for Formalizing Forcefield Atom-Typing for Molecular Simulation

Forcefields are a crucial ingredient of Molecular Dynamics (MD) simulations: they describe the types and parameters of the interactions between the simulated particles. These parameter sets, however, are typically specific to the molecule in which the atoms appear, the position of each atom within that molecule, the phase or state point of the system, and the simulation tool in use. This makes choosing the correct parameter values a tedious and error-prone task. Forcefield parameters are, furthermore, often hard to locate: some are published in scientific papers, others ship with MD tools, often with missing or ambiguous documentation on their applicability. In this paper, we present a framework that aims to solve this data management issue by proposing a common format for forcefields that is self-documenting, with machine-readable, declarative usage rules. We believe that processes and tools commonly used in software development today (e.g., unit testing, verification and validation, continuous integration, and version control) are, with proper infrastructure support, applicable to forcefield development as well. The paper describes how such an infrastructure can support the MD community in managing and evolving forcefields, and proposes a way to encourage and incentivize involvement by the stakeholders.
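To make the idea of self-documenting, declarative usage rules more concrete, the following is a minimal sketch in Python. It is not the format proposed in the paper: the `Rule` structure, the type names (`C_CH3`, `C_CH2`, `H_C`), and the `assign_types` helper are all hypothetical and chosen only for illustration. Each rule states which element and which bonded neighbors an atom must have to receive a given type, and carries its own documentation string. A molecule is assumed to be given as a list of element symbols plus a list of bonds between atom indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A declarative atom-typing rule (hypothetical format, for illustration)."""
    atom_type: str    # type name assigned when the rule matches
    element: str      # element the atom must have
    neighbors: tuple  # sorted elements of the atoms bonded to it
    doc: str          # human-readable note on when the rule applies


# Illustrative rules for all-atom alkanes; names are not from a real forcefield.
RULES = [
    Rule("C_CH3", "C", ("C", "H", "H", "H"), "methyl carbon in an alkane"),
    Rule("C_CH2", "C", ("C", "C", "H", "H"), "methylene carbon in an alkane"),
    Rule("H_C", "H", ("C",), "hydrogen bonded to a carbon"),
]


def assign_types(elements, bonds, rules=RULES):
    """Return one atom type per atom, or raise if typing is not unique."""
    # Build an adjacency list from the bond list.
    adjacency = {i: [] for i in range(len(elements))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    types = []
    for i, element in enumerate(elements):
        # The local environment is the sorted tuple of neighbor elements.
        env = tuple(sorted(elements[j] for j in adjacency[i]))
        matches = [r.atom_type for r in rules
                   if r.element == element and r.neighbors == env]
        # Exactly one rule must match; anything else is reported as an error.
        if len(matches) != 1:
            raise ValueError(
                f"atom {i} ({element}, neighbors {env}) matched {len(matches)} rules"
            )
        types.append(matches[0])
    return types


def ethane():
    """Ethane as (elements, bonds): two carbons, six hydrogens."""
    elements = ["C", "C"] + ["H"] * 6
    bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)]
    return elements, bonds


def propane():
    """Propane as (elements, bonds): three carbons, eight hydrogens."""
    elements = ["C", "C", "C"] + ["H"] * 8
    bonds = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5),
             (1, 6), (1, 7), (2, 8), (2, 9), (2, 10)]
    return elements, bonds
```

Because the rules are plain data, they can be kept under version control and checked by ordinary unit tests in continuous integration. In this sketch, a test would assert that `assign_types(*ethane())` types both carbons as `C_CH3`, and that a molecule with no applicable rule (for example, methane with this rule set) raises an error instead of silently receiving wrong parameters.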
