Wondering why data are missing from query results? Ask Conseil Why-Not

When analyzing and debugging data transformations, and relational queries in particular, one subproblem is to understand why some data are not part of the query result. This problem has recently been addressed from different perspectives for various fragments of relational queries. These perspectives yield different yet complementary explanations of such missing answers. This paper first aims to unify the different approaches by defining a new type of explanation, called a hybrid explanation, that encompasses the previously defined types of explanations. This solution goes beyond simply forming the union of the explanations produced by different algorithms and is shown to explain a larger set of missing answers. Second, we present Conseil, an algorithm that generates hybrid explanations. Conseil is also the first algorithm to handle non-monotonic queries. Experiments on efficiency and explanation quality show that Conseil is comparable to, and even outperforms, previous algorithms.
