ExTASY: Scalable and flexible coupling of MD simulations and advanced sampling techniques

For many macromolecular systems, accurate sampling of the relevant regions of the potential energy surface cannot be obtained from a single, long Molecular Dynamics (MD) trajectory. New approaches are required to promote more efficient sampling. We present the design and implementation of the Extensible Toolkit for Advanced Sampling and analYsis (ExTASY) for building and executing advanced sampling workflows on HPC systems. ExTASY provides Python-based “templated scripts” that interface to an interoperable, high-performance, pilot-based runtime system, which abstracts the complexity of managing multiple simulations. ExTASY supports the use of existing, highly optimised parallel MD codes and their coupling to analysis tools based upon collective coordinates that do not require a priori knowledge of the system to bias. We describe two workflows, both of which couple large “ensembles” of relatively short MD simulations with analysis tools that automatically analyse the generated trajectories and identify molecular conformations to be used on-the-fly as new starting points for further “simulation-analysis” iterations. One of the workflows leverages the Locally Scaled Diffusion Maps technique; the other makes use of Complementary Coordinates techniques to enhance sampling and generate start-points for the next generation of MD simulations. We show that the ExTASY tools have been deployed on a range of HPC systems, including ARCHER (Cray XC30), Blue Waters (Cray XE6/XK7), and Stampede (Linux cluster), and that good strong scaling can be obtained up to thousands of MD simulations, independent of the size of each simulation. We discuss how ExTASY can be easily extended or modified by end-users to build their own workflows, and ongoing work to improve the usability and robustness of ExTASY.
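The iterative “simulation-analysis” pattern described above can be illustrated with a minimal toy sketch. This is not the ExTASY API and not either of the two workflows: the function names (`run_short_md`, `pick_start_points`, `simulation_analysis_loop`) are hypothetical, the "MD" is overdamped Langevin dynamics on an assumed one-dimensional double-well potential, and the analysis step is a simple histogram that selects frames from the least-visited bins, standing in for a real collective-coordinate analysis such as Locally Scaled Diffusion Maps or Complementary Coordinates. The ensemble is run serially here; in a real workflow the runtime system would execute the simulations concurrently.

```python
import numpy as np


def run_short_md(x0, n_steps, rng, dt=1e-3, kT=1.0):
    """Toy stand-in for one short MD run: overdamped Langevin dynamics
    on the assumed double-well potential U(x) = (x^2 - 1)^2."""
    x = x0
    traj = np.empty(n_steps)
    for i in range(n_steps):
        force = -4.0 * x * (x * x - 1.0)  # -dU/dx
        x = x + force * dt + np.sqrt(2.0 * kT * dt) * rng.normal()
        traj[i] = x
    return traj


def pick_start_points(samples, n_points, bins=30):
    """Toy stand-in for the analysis step: return the frames that fall
    in the least-populated histogram bins of everything sampled so far."""
    counts, edges = np.histogram(samples, bins=bins)
    # Map every frame to its bin index (0 .. bins-1).
    idx = np.digitize(samples, edges[1:-1])
    # Sort frames by the population of their bin, rarest first.
    order = np.argsort(counts[idx], kind="stable")
    return samples[order[:n_points]]


def simulation_analysis_loop(n_iter=5, n_sims=8, n_steps=200, seed=0):
    """Alternate an ensemble of short simulations with an analysis step
    that chooses the start points for the next iteration."""
    rng = np.random.default_rng(seed)
    starts = np.full(n_sims, -1.0)  # all replicas begin in the left well
    history = []
    for _ in range(n_iter):
        # Simulation stage: one short trajectory per start point.
        trajs = [run_short_md(x0, n_steps, rng) for x0 in starts]
        history.append(np.concatenate(trajs))
        # Analysis stage: use all data so far to pick new start points.
        starts = pick_start_points(np.concatenate(history), n_sims)
    return np.concatenate(history), starts


if __name__ == "__main__":
    samples, starts = simulation_analysis_loop()
    print("frames collected:", samples.size)
    print("next start points:", np.round(starts, 2))
```

Restarting from rarely visited frames is only one possible selection rule; it is used here because it needs no prior knowledge of the system, which is the property the abstract emphasises for the analysis tools.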
