Reexamining Some Holy Grails of Data Provenance

We reconsider some of the explicit and implicit properties that underlie well-established definitions of data provenance semantics. Previous work on comparing provenance semantics has mostly focused on expressive power (does the provenance generated by one semantics subsume the provenance generated by others?) and on whether a semantics is insensitive to query rewriting (i.e., do equivalent queries have the same provenance?). In contrast, we investigate why certain semantics possess specific properties, such as insensitivity, and whether these properties are always desirable. We also present a new property, stability with respect to query language extension, which, to the best of our knowledge, has not previously been isolated and studied on its own.
