Semantically Linking and Browsing Provenance Logs for E-science

e-Science experiments are those performed using computer-based resources such as database searches, simulations or other applications. As with their laboratory-based counterparts, the data associated with an e-Science experiment are of reduced value if other scientists cannot identify the origin, or provenance, of those data. Provenance is the term given to metadata about experiment processes, the derivation paths of data, and the sources and quality of experimental components, including the scientists themselves and related literature. Consequently, provenance metadata are valuable resources for e-Scientists: they can be used to repeat experiments, track versions of data and experiment runs, and verify experiment results, and they serve as a source of experimental insight. One specific kind of in silico experiment is a workflow. In this paper we describe how we can assemble a Semantic Web of workflow provenance logs that allows a bioinformatician to browse and navigate between experimental components by generating hyperlinks based on the semantic annotations associated with them. By associating well-formalized semantics with workflow logs, we take a step towards the integration of process provenance information and improved knowledge discovery.
