Bringing Provenance to Its Full Potential Using Causal Reasoning

Provenance information is often used to explain query results and outcomes, exploit results of prior reasoning, and establish trust in data. The generality of the notion makes it applicable in a variety of domains, including data warehousing [7], curated databases [4], and various scientific applications. The recent introduction of causal reasoning in a database setting exploits provenance in ways that extend its applicability to more complex problems and open new directions, a step towards achieving provenance’s full potential. In this paper, we explore through a variety of examples how causality improves on provenance information, discuss the challenges of building causality-aware systems, and propose some new directions.

[1] James Cheney, et al. Provenance in Databases: Why, How, and Where, 2009, Found. Trends Databases.

[2] V. Vianu, et al. Edinburgh Why and Where: A Characterization of Data Provenance, 2017.

[3] Yannis Papakonstantinou, et al. Hypothetical Queries in an OLAP Environment, 2000, VLDB.

[4] D. Hubin, et al. The Journal of Philosophy, 2004.

[5] James Cheney, et al. Curated databases, 2008, PODS.

[6] Dan Suciu, et al. WHY SO? or WHY NO? Functional Causality for Explaining Query Answers, 2009, MUD.

[7] Franz von Kutschera, et al. Causation, 1993, J. Philos. Log.

[8] Val Tannen, et al. Provenance semirings, 2007, PODS.

[9] Joseph Y. Halpern, et al. Causes and Explanations: A Structural-Model Approach. Part I: Causes, 2000, The British Journal for the Philosophy of Science.

[10] J. Pearl. Causality: Models, Reasoning and Inference, 2000.

[11] Dan Suciu, et al. Causality in Databases, 2010, IEEE Data Eng. Bull.

[12] Daniel Deutch, et al. Provenance for aggregate queries, 2011, PODS.

[13] Suman Nath, et al. Tracing data errors with view-conditioned causality, 2011, SIGMOD '11.

[14] Luc Moreau, et al. Extracting causal graphs from an open provenance data model, 2008.

[15] Joseph Y. Halpern, et al. Causes and explanations: A structural-model approach, 2000.

[16] Laks V. S. Lakshmanan, et al. What-if OLAP Queries with Changing Dimensions, 2008, 2008 IEEE 24th International Conference on Data Engineering.

[17] Dan Suciu, et al. The Complexity of Causality and Responsibility for Query Answers and non-Answers, 2010, Proc. VLDB Endow.

[18] Kiran-Kumar Muniswamy-Reddy, et al. Causality-based versioning, 2009, TOS.

[19] James Cheney, et al. Causality and the Semantics of Provenance, 2010, DCM.

[20] Estevam R. Hruschka, et al. Toward an Architecture for Never-Ending Language Learning, 2010, AAAI.

[21] Joseph Y. Halpern, et al. Responsibility and Blame: A Structural-Model Approach, 2003, IJCAI.

[22] Surajit Chaudhuri, et al. Data Warehousing and OLAP for Decision Support (Tutorial), 1997, SIGMOD Conference.