Highly Interactive, Steered Scientific Workflows on HPC Systems: Optimizing Design Solutions

Scientific workflows are becoming increasingly important in high performance computing (HPC) settings, as growing hardware capabilities make it both feasible and attractive to run many heterogeneous tasks simultaneously. Currently, no HPC-based workflow platform supports a dynamically adaptable workflow with interactive steering and analysis at run-time. Furthermore, most workflow programs fix the compute resources for a given instance, which can waste expensive allocation resources as tasks are spawned and killed. Here we describe the design and testing of a run-time-interactive, adaptable, steered workflow tool capable of executing thousands of parallel tasks without an MPI programming model, using a database management system to facilitate task management through multiple live connections. We find that on the Oak Ridge Leadership Computing Facility pre-exascale Summit supercomputer it is possible to launch and interactively steer workflows with thousands of simultaneous tasks with negligible latency. For particle simulation and analysis tasks that run for minutes to hours, this paradigm offers the prospect of a robust and efficient means to perform simulation-space exploration with on-the-fly analysis and adaptation.
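
The idea of managing tasks through a database rather than an MPI programming model can be illustrated with a minimal sketch. The abstract does not say which database management system, schema, or interface the tool uses, so everything below is an assumption made for illustration: SQLite from the Python standard library stands in for the database, and the table layout and the function names (`open_db`, `add_task`, `claim_task`, `finish_task`, `cancel_pending`, `state_counts`) are hypothetical. Workers claim pending tasks inside a transaction, while a steering client can add or cancel tasks and inspect progress while the workflow runs.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open a connection and create the (hypothetical) task table if needed."""
    # isolation_level=None puts sqlite3 in autocommit mode, so transactions
    # are controlled explicitly with BEGIN/COMMIT below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tasks (
               id      INTEGER PRIMARY KEY,
               command TEXT NOT NULL,
               state   TEXT NOT NULL DEFAULT 'pending',
               worker  TEXT
           )"""
    )
    return conn


def add_task(conn, command):
    """Steering side: enqueue a new task at any time during the run."""
    cur = conn.execute("INSERT INTO tasks (command) VALUES (?)", (command,))
    return cur.lastrowid


def claim_task(conn, worker):
    """Worker side: take the oldest pending task, or return None if there is none."""
    # BEGIN IMMEDIATE acquires the write lock up front, so two connections
    # to the same database file cannot claim the same row.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, command FROM tasks "
            "WHERE state = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE tasks SET state = 'running', worker = ? WHERE id = ?",
                (worker, row[0]),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row


def finish_task(conn, task_id):
    """Worker side: mark a claimed task as completed."""
    conn.execute("UPDATE tasks SET state = 'done' WHERE id = ?", (task_id,))


def cancel_pending(conn, pattern):
    """Steering side: cancel not-yet-started tasks whose command matches a LIKE pattern."""
    cur = conn.execute(
        "UPDATE tasks SET state = 'cancelled' "
        "WHERE state = 'pending' AND command LIKE ?",
        (pattern,),
    )
    return cur.rowcount


def state_counts(conn):
    """Steering side: summarize how many tasks are in each state."""
    return dict(conn.execute("SELECT state, COUNT(*) FROM tasks GROUP BY state"))
```

In this sketch each worker and each interactive session would hold its own connection to a shared database; with SQLite that means a shared file path rather than `:memory:`, and a tool running at the scale described would more plausibly rely on a server-based database. The sketch only conveys the pattern of steering a running workflow by editing shared task state.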
