A Strategy for Provenance Gathering in Distributed Scientific Workflows

Running scientific workflows in distributed environments has motivated provenance approaches that are loosely coupled to the workflow system. Such approaches are attractive because they allow provenance data to be stored and accessed in an integrated way, even in an environment where different workflow management systems work together. To provide provenance functionality, however, existing approaches burden scientists with many manual tasks, such as adapting scripts and implementing extra functionality. This is not a good solution for users who lack the expertise these tasks require, as most scientists do. Hence, the objective of this paper is to define a provenance strategy that facilitates the gathering of provenance information in a distributed environment.
