Porting Adaptive Ensemble Molecular Dynamics Workflows to the Summit Supercomputer

Molecular dynamics (MD) simulations must take very small (femtosecond) integration steps in simulation time to avoid numerical errors. Efficient use of parallel programming models and accelerators in state-of-the-art MD programs is now pushing against Moore’s limit for the time per MD step. As a result, directly simulating timescales beyond milliseconds will not be attainable, even at exascale. However, concepts from statistical physics can be used to combine many parallel simulations to provide information about longer timescales and to adequately sample the simulation space, while preserving details about the dynamics of the system. Implementing such an approach requires a workflow program that allows adaptable steering of task assignments based on extensive statistical analysis of intermediate results. Here we report the implementation of such an adaptable workflow program to drive simulations on the Summit IBM Power System AC922, a pre-exascale supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). We compare our experience on Summit with that on Titan, Summit’s predecessor, report the performance of the workflow and its components, and describe the porting process. We find that a workflow program managed by a MongoDB database can provide the fault tolerance, scalable performance, task dispatch rate, and reconfigurability required for robust and portable implementation of ensemble simulations such as those used in enhanced-sampling molecular dynamics. This type of workflow generator can also be used to provide adaptive steering of ensemble simulations for applications other than MD.
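
The timescale argument above can be made concrete with a small back-of-the-envelope calculation. The sketch below is illustrative only: the 2 fs timestep, the 1 ms wall-clock cost per step, and the ensemble size are assumed values chosen for the example, not measurements from this work, and the function names are hypothetical.

```python
# Illustrative arithmetic only; all numeric inputs are assumptions.
FEMTOSECOND = 1e-15                      # seconds of simulated time
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # wall-clock seconds in a year


def steps_required(target_time_s, timestep_fs=2.0):
    """Number of sequential integration steps to reach target_time_s."""
    return round(target_time_s / (timestep_fs * FEMTOSECOND))


def wall_clock_years(target_time_s, timestep_fs=2.0, seconds_per_step=1e-3):
    """Wall-clock years for ONE trajectory, given an assumed cost per step."""
    steps = steps_required(target_time_s, timestep_fs)
    return steps * seconds_per_step / SECONDS_PER_YEAR


def aggregate_sim_time(n_trajectories, trajectory_time_s):
    """Total simulated time collected by an ensemble of independent runs."""
    return n_trajectories * trajectory_time_s


if __name__ == "__main__":
    one_ms = 1e-3
    print("steps for 1 ms:", steps_required(one_ms))
    print("years for one 1 ms trajectory:", round(wall_clock_years(one_ms), 1))
    # Assumed ensemble: 10,000 trajectories of 100 ns each.
    print("aggregate simulated time (s):", aggregate_sim_time(10_000, 100e-9))
```

Under these assumptions a single millisecond trajectory needs 5 x 10^11 sequential steps, or about 16 years of wall-clock time, because the steps cannot be parallelized in time. An ensemble of many short trajectories reaches the same aggregate simulated time in parallel, which is why the statistical combination of trajectories, and a workflow to steer them, becomes necessary.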
