Scientific workflow management systems and workflow patterns

Scientific workflow management systems are built primarily on dataflow-oriented execution models. Consequently, they provide only a limited number of control-flow constructs, and these constructs are represented differently from one system to another. This is a problem because the exploratory nature of scientific analysis requires workflows to adapt dynamically to external events and to control the execution of individual workflow components, so some degree of control flow is necessary. Without a standard specification for control-flow constructs, workflows are designed with custom-developed components that offer almost no reusability. In this paper, we present a standard set of control-flow constructs for scientific workflow management systems based on workflow patterns. First, we compare the control-flow constructs available in three scientific workflow management systems: Kepler, Taverna, and Triana. Second, we implement these patterns as a template library in Kepler. Finally, we demonstrate how this template library can be used to design scientific workflows.
