SciProv: An Architecture for Semantic Query in Provenance Metadata on e-Science Context

This article describes SciProv, an architecture that aims to interact with Scientific Workflow Management Systems in order to capture and manipulate provenance metadata. To this end, SciProv adopts an abstract model for representing lineage, the Open Provenance Model (OPM), which allows it to set up a homogeneous and interoperable infrastructure for handling provenance metadata. As a result, SciProv provides a framework for querying the provenance metadata generated in an e-Science scenario. The architecture also uses Semantic Web technology to process provenance queries: with ontologies and inference engines, SciProv can make inferences about lineage and thereby return information beyond what is explicitly recorded in the managed data.
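To illustrate what "information beyond what is explicitly recorded" can mean, the following is a minimal sketch, not SciProv's actual ontology or reasoner. It assumes a handful of hypothetical artifact names and a single rule, the transitivity of an OPM-style "was derived from" relation, and computes the lineage pairs that follow from the explicit records but were never stored.

```python
from collections import defaultdict

# Hypothetical explicit provenance records, as (artifact, source) pairs
# meaning "artifact was derived from source". Names are illustrative only.
DERIVED_FROM = [
    ("aligned_reads", "raw_reads"),
    ("variant_calls", "aligned_reads"),
    ("report", "variant_calls"),
]


def infer_lineage(edges):
    """Return every (artifact, ancestor) pair implied by transitivity."""
    direct = defaultdict(set)
    for artifact, source in edges:
        direct[artifact].add(source)

    def ancestors(node, seen):
        # Depth-first walk; `seen` also guards against cycles in bad data.
        for parent in direct.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                ancestors(parent, seen)
        return seen

    return {(n, a) for n in list(direct) for a in ancestors(n, set())}


# Pairs that were not recorded explicitly but follow from the rule.
inferred = infer_lineage(DERIVED_FROM) - set(DERIVED_FROM)
for artifact, ancestor in sorted(inferred):
    print(f"{artifact} was (indirectly) derived from {ancestor}")
```

A query such as "which artifacts ultimately depend on raw_reads?" can then be answered from the inferred pairs even though only single-step derivations were captured. An ontology-based engine would express the same rule declaratively instead of in procedural code.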
