A Characterization of the Complexity of Resilience and Responsibility for Conjunctive Queries

Several research thrusts in the area of data management have focused on understanding how changes in the data affect the output of a view or standing query. Example applications are explaining query results, propagating updates through views, and anonymizing datasets. These applications usually rely on understanding how interventions in a database impact the output of a query. An important aspect of this analysis is the problem of deleting a minimum number of tuples from the input tables to make a given Boolean query false. We refer to this problem as "the resilience of a query" and show its connections to the well-studied problems of deletion propagation and causal responsibility. We study the complexity of resilience for self-join-free conjunctive queries, and also make several contributions to previously known results for the problems of deletion propagation with source side-effects and causal responsibility: (1) We define the notion of resilience and provide a complete dichotomy for the class of self-join-free conjunctive queries with arbitrary functional dependencies; this dichotomy also extends and generalizes previous tractability results on deletion propagation with source side-effects. (2) We formalize the connection between resilience and causal responsibility, and show that resilience has a larger class of tractable queries than responsibility. (3) We identify a mistake in a previous dichotomy for the problem of causal responsibility and offer a revised characterization based on new, simpler, and more intuitive notions. (4) Finally, we extend the dichotomy for causal responsibility in two ways: (a) we treat cases where the input tables contain functional dependencies, and (b) we compute responsibility for a set of tuples specified via wildcards.
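To make the definition of resilience concrete, the following minimal sketch computes it by exhaustive search for one small Boolean conjunctive query, q :- R(x, y), S(y, z). The query, the relation names `R` and `S`, the sample database, and the helper names `query_holds` and `resilience` are all assumptions chosen for illustration. The search takes exponential time and is not the algorithm behind the dichotomy described above; it only spells out what "deleting a minimum number of tuples to make the query false" means.

```python
from itertools import combinations


def query_holds(db):
    """Hypothetical Boolean query q :- R(x, y), S(y, z).

    True if some R tuple and some S tuple agree on the join variable y.
    """
    return any(r[1] == s[0] for r in db["R"] for s in db["S"])


def resilience(db, holds):
    """Brute-force resilience: the fewest tuple deletions that falsify `holds`.

    Tries deletion sets in order of increasing size, so the first set that
    makes the query false is a minimum one. Exponential in the database size.
    """
    tuples = [(rel, t) for rel, rows in db.items() for t in rows]
    for k in range(len(tuples) + 1):
        for removed in combinations(tuples, k):
            gone = set(removed)
            remaining = {
                rel: {t for t in rows if (rel, t) not in gone}
                for rel, rows in db.items()
            }
            if not holds(remaining):
                return k, sorted(gone)
    return None  # only reached if the query is true even on the empty database


# Illustrative database: both R tuples join with the single S tuple.
db = {"R": {(1, 2), (3, 2)}, "S": {(2, 5)}}
size, deleted = resilience(db, query_holds)
print(size, deleted)
```

In this sample database, deleting the single tuple S(2, 5) removes both join results at once, so the resilience is 1, whereas deleting from R alone would take two deletions.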
