PaPaS: A Portable, Lightweight, and Generic Framework for Parallel Parameter Studies

Scientific research today relies heavily on modeling and simulation, and simulations are often complex in both their execution flow and their parameterization. Execution flows are not necessarily straightforward, since they may require multiple processing tasks and iterations. Furthermore, parameter and performance studies are common approaches for characterizing a simulation, and they often require traversing a large parameter space. High-performance computers offer practical resources for such studies, but at the expense of users handling the setup, submission, and management of jobs. This work presents the design of PaPaS, a portable, lightweight, and generic workflow framework for conducting parallel parameter and performance studies. Workflows are defined in parameter files that use a keyword-value pair syntax, which relieves the user of writing complex scripts to manage the workflow. A parameter set consists of any combination of environment variables, files, partial file contents, and command line arguments. PaPaS is being developed in Python 3 with support for distributed parallelization using SSH, batch systems, and C++ MPI. The framework is designed to run as user processes and can be used on single-node, multi-node, and multi-tenant computing systems. Two examples are presented: a simulation using the BehaviorSpace tool from NetLogo as a parameter study, and a matrix multiply using OpenMP as a performance study. The results demonstrate that the PaPaS framework offers a simple method for defining and managing parameter studies, while increasing resource utilization.
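
To make the idea of a parameter set concrete, the following is a minimal Python sketch of how keyword-value pairs could be expanded into individual tasks. It is not the PaPaS implementation or its file syntax: the function names `expand_parameter_space` and `build_task`, the `./matmul` executable, and its `--size` flag are all hypothetical, and the sketch assumes the study is a full Cartesian product of the listed values, split into environment variables and command line arguments.

```python
import itertools
import shlex


def expand_parameter_space(parameters):
    """Return one dict per combination of values (full Cartesian product).

    `parameters` maps a keyword to the list of values to sweep over.
    """
    keys = list(parameters)
    value_lists = [parameters[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


def build_task(command_template, environ_keys, combo):
    """Split one combination into environment variables and an argv list.

    Keywords named in `environ_keys` become environment variables; the
    remaining keywords are substituted into `command_template`.
    """
    env = {key: str(combo[key]) for key in environ_keys}
    args = {key: value for key, value in combo.items() if key not in environ_keys}
    return env, shlex.split(command_template.format(**args))


# Hypothetical performance study: thread count is an environment variable,
# matrix size is a command line argument of an assumed ./matmul binary.
study = {
    "OMP_NUM_THREADS": [1, 2, 4],
    "size": [256, 512],
}

tasks = [
    build_task("./matmul --size {size}", {"OMP_NUM_THREADS"}, combo)
    for combo in expand_parameter_space(study)
]

for env, argv in tasks:
    print(env, argv)
```

Each resulting pair could then be handed to a local process, an SSH session, or a batch job; the sketch stops at task generation because the abstract does not specify how PaPaS dispatches work.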
