Explaining Missing Answers to Top-k SQL Queries

Because existing database systems are increasingly difficult to use, improving their quality and usability has gained tremendous momentum over the last few years. In particular, explaining why some expected tuples are missing from the result of a query has received growing attention. In this paper, we study the problem of explaining missing answers to top-k queries in the context of SQL (i.e., with selection, projection, join, and aggregation). We approach this problem with the query-refinement method: given as inputs the original top-k SQL query and a set of missing tuples, our algorithms return to the user a refined query whose result includes both the missing tuples and the original query results. Case studies and experimental results show that our algorithms return high-quality explanations efficiently.
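
To make the query-refinement idea concrete, the following is a minimal sketch and not the algorithms of this paper. It assumes the simplest possible refinement, namely enlarging k while leaving the selection, join, and ranking clauses untouched, so it can only recover missing tuples that already satisfy the WHERE clause. The table `hotel`, the helper `refine_k`, and the sample data are hypothetical names introduced for illustration.

```python
import sqlite3

def refine_k(conn, base_sql, k, missing, max_k=1000):
    """Return the smallest k' >= k whose top-k' result contains every
    missing key, or None if enlarging k alone cannot recover them.

    base_sql is a ranked query without a LIMIT clause; its first column
    is treated as the tuple key (an assumption of this sketch).
    """
    ranked = [row[0] for row in conn.execute(base_sql + " LIMIT ?", (max_k,))]
    positions = []
    for key in missing:
        if key not in ranked:
            # The tuple is filtered out before ranking; a real refinement
            # would have to modify the selection condition instead.
            return None
        positions.append(ranked.index(key) + 1)
    return max([k] + positions)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotel (name TEXT PRIMARY KEY, price REAL, rating REAL)")
conn.executemany(
    "INSERT INTO hotel VALUES (?, ?, ?)",
    [("A", 80, 4.8), ("B", 95, 4.5), ("C", 60, 4.1), ("D", 120, 4.9), ("E", 70, 3.9)],
)

# Original top-2 query; the tie-break on name keeps the ranking deterministic.
BASE = "SELECT name FROM hotel WHERE price <= 100 ORDER BY rating DESC, name"
original = [row[0] for row in conn.execute(BASE + " LIMIT 2")]

# The user asks why hotel "C" is missing from the top-2 result.
new_k = refine_k(conn, BASE, 2, ["C"])
refined = [row[0] for row in conn.execute(BASE + " LIMIT ?", (new_k,))]

print(original, new_k, refined)
```

Because the refined query only raises the LIMIT over the same ranking, its result is a superset of the original top-k result, which matches the requirement that the refined query return both the missing tuples and the original answers. Asking about hotel "D" yields `None` here, since "D" fails the price predicate and would require refining the selection condition as well.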
