Community Resources for Enabling Research in Distributed Scientific Workflows

A significant amount of recent research in scientific workflows aims to develop new techniques, algorithms, and systems that can overcome the challenges of efficient and robust execution of ever larger workflows on increasingly complex distributed infrastructures. Because the infrastructures, systems, and applications are complex, and their behavior is difficult to reproduce using physical experiments, much of this research is based on simulation. However, there is a shortage of realistic datasets and tools that can be used for such studies. In this paper we describe a collection of tools and data that have enabled research in new techniques, algorithms, and systems for scientific workflows. These resources include: 1) execution traces of real workflow applications, from which workflow and system characteristics such as resource usage and failure profiles can be extracted; 2) a synthetic workflow generator that can produce realistic synthetic workflows based on profiles extracted from execution traces; and 3) a simulator framework that can simulate the execution of synthetic workflows on realistic distributed infrastructures. We describe how we have used these resources to investigate new techniques for efficient and robust workflow execution and to improve the Pegasus Workflow Management System and other workflow tools. Our goal in describing these resources is to share them with other researchers in the workflow research community.
