Building Scientific Workflow with Taverna and BPEL: A Comparative Study in caGrid

With the emergence of "service-oriented science," the need arises to orchestrate multiple services to facilitate scientific investigation, that is, to create "science workflows." In this paper we summarize our findings in providing a workflow solution for the caGrid service-based grid infrastructure. We choose BPEL and Taverna as candidate solutions and compare their usability across the full lifecycle of a scientific workflow, including service discovery, service composition, workflow execution, and workflow result analysis. We find that BPEL offers a comprehensive set of primitives for modeling processes of all flavors, while Taverna provides a more compact set of primitives and a functional programming model that eases data flow modeling. We hope that our analysis not only helps researchers choose a tool that meets their needs, but also provides some insight into how a workflow language and tool can fulfill the requirements of scientists.
