Orchestrating caGrid Services in Taverna

caBIG™ (the cancer Biomedical Informatics Grid™) is an open-source, open-access information network that enables cancer researchers to share tools, data, applications, and technologies. caGrid is the underlying service-based grid software infrastructure for caBIG; it integrates distributed data and analytic resources into a virtual collaborative platform for cancer research. Within caGrid, many cancer-related data analysis and aggregation tasks can make use of "canned" sets of service invocations, or workflows. There is therefore a need to orchestrate the invocation of caGrid services using both a workflow language and supporting tooling. In this paper, we first explain why we selected Taverna as a candidate for workflow authoring and invocation. We then review the development of Taverna plug-ins in general and describe how we extended Taverna to use caGrid services. Next, we detail a real-world example and the lessons learned from our research. Finally, we conclude with a summary and a description of potential next steps.
