Supporting Provenance in Service-oriented Computing Using the Semantic Web Technologies

The Web is evolving from a global information space into a collaborative problem-solving environment in which services (resources) are dynamically discovered, composed into workflows to solve a problem, and later disbanded. This creates an increasing demand for provenance, which lets users trace how a particular result was derived by identifying the resources, configurations and execution settings involved. In this paper we analyse the nature of service-oriented computing and define a new concept called augmented provenance. Augmented provenance enhances conventional provenance data with extensive metadata and semantics, thus enabling large-scale resource sharing and deep reuse. We propose a hybrid approach, based on Semantic Web Services (SWS), for the creation and management of augmented provenance: semantic annotation is used to generate semantic provenance data, and a database management system is used to manage execution data. We present a general architecture for the approach and discuss mechanisms for modeling, capturing, recording and querying augmented provenance data. The approach has been applied to a real-world application, for which tools and GUIs were developed to facilitate provenance management and exploitation.
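The hybrid split described above can be illustrated with a minimal sketch. It is not the paper's implementation: the class `HybridProvenanceStore`, its method names, the table layout, and the service and concept identifiers are all hypothetical, and a plain set of triples stands in for a real semantic store. It only shows the idea of keeping execution records in a relational database while semantic annotations are kept separately and joined at query time.

```python
import sqlite3


class HybridProvenanceStore:
    """Illustrative only: execution data in a DBMS, semantics as triples."""

    def __init__(self):
        # Execution data (runs, inputs, outputs) goes into a relational store.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE execution "
            "(run_id TEXT, service TEXT, inputs TEXT, output TEXT)"
        )
        # Semantic annotations are held as (subject, predicate, object) triples.
        # A real system would use an RDF store; a set is an assumption here.
        self.triples = set()

    def record_execution(self, run_id, service, inputs, output):
        """Record one service invocation in the relational store."""
        self.db.execute(
            "INSERT INTO execution VALUES (?, ?, ?, ?)",
            (run_id, service, repr(inputs), repr(output)),
        )
        self.db.commit()

    def annotate(self, subject, predicate, obj):
        """Attach a semantic annotation to a service or other resource."""
        self.triples.add((subject, predicate, obj))

    def runs_using(self, concept):
        """Return run ids whose service is annotated as an instance of concept."""
        services = sorted(
            s for (s, p, o) in self.triples if p == "rdf:type" and o == concept
        )
        if not services:
            return []
        marks = ",".join("?" * len(services))
        rows = self.db.execute(
            f"SELECT run_id FROM execution WHERE service IN ({marks}) "
            "ORDER BY run_id",
            services,
        )
        return [row[0] for row in rows]


# Hypothetical usage: two runs, only one of which uses an annotated service.
store = HybridProvenanceStore()
store.annotate("meshgen_v1", "rdf:type", "ex:MeshGenerationService")
store.record_execution("run-1", "meshgen_v1", {"density": 0.5}, "mesh-001")
store.record_execution("run-2", "solver_v3", {}, "result-001")
matching_runs = store.runs_using("ex:MeshGenerationService")
```

In this sketch, a query phrased in terms of a concept is first resolved against the annotations and then against the execution table, which is one plausible way a semantic layer and a database could cooperate when answering provenance queries.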
