Performance assessment of ensembles of in situ workflows under resource constraints

Scientific breakthroughs in biomolecular methods and improvements in hardware technology have shifted the focus from a single long‐running simulation to a large set of shorter simulations running simultaneously, called an ensemble. In an ensemble, simulations are usually coupled with analyses of the data they produce. In situ methods can be used to analyze large volumes of data generated by scientific simulations at runtime (i.e., simulations and analyses are performed concurrently). In this work, we study the execution of ensemble‐based simulations paired with in situ analyses using in‐memory staging methods. Using an ensemble of molecular dynamics in situ workflows with multiple simulations and analyses, we first show that collecting traditional metrics such as makespan, instructions per cycle, memory usage, or cache miss ratio is not sufficient to characterize the complex behaviors of ensembles. We propose a method to evaluate the performance of ensembles of workflows that captures multiple aspects of resource usage: resource efficiency, resource allocation, and resource provisioning. Experimental results demonstrate that the proposed method can effectively distinguish the performance of different component placements in an ensemble with up to 32 ensemble members. By evaluating different co‐location scenarios, our proposed performance indicators demonstrate the benefits of co‐locating a simulation and its coupled analyses within a compute node.
