Data Provenance Inference in Logic Programming: Reducing Effort of Instance-driven Debugging

Data provenance allows scientists in different domains to validate their models and algorithms and to detect anomalies and unexpected behaviors. In previous work, we described the on-the-fly interpretation of (Python) scripts to build a workflow provenance graph automatically and then infer fine-grained provenance information from that graph and the availability of data. To broaden the scope of our approach and demonstrate its viability, in this paper we extend it beyond procedural languages to purely declarative languages such as logic programming under the stable model semantics. For experiments and validation, we use the Answer Set Programming solver oClingo, which makes it possible to formulate and solve stream reasoning problems in a purely declarative fashion. We demonstrate that the benefits of provenance inference over explicit provenance still hold in a declarative setting, and we briefly discuss the potential impact on declarative programming, in particular on instance-driven debugging of the model in declarative problem solving.
