A More General Theory of Static Approximations for Conjunctive Queries

Conjunctive query (CQ) evaluation is NP-complete, but becomes tractable for fragments of bounded hypertreewidth. Approximating a hard CQ by a query from such a fragment can thus allow for an efficient approximate evaluation. While underapproximations (i.e., approximations that return correct answers only) are well understood, the dual notion of overapproximations (i.e., approximations that return complete, but not necessarily sound, answers), and also a more general notion of approximation based on the symmetric difference of query results, are almost unexplored. In fact, the decidability of the basic problems of evaluation, identification, and existence of those approximations has been open. This article establishes a connection between overapproximations and existential pebble games that allows for studying such problems systematically. Building on this connection, it is shown that the evaluation and identification problems for overapproximations can be solved in polynomial time. While the general existence problem remains open, the problem is shown to be decidable in 2EXPTIME over the class of acyclic CQs and in PTIME for Boolean CQs over binary schemata. Additionally, we propose a more liberal notion of overapproximations to remedy the known shortcoming that queries might not have an overapproximation, and study how queries can be overapproximated in the presence of tuple-generating and equality-generating dependencies. The techniques are then extended to symmetric difference approximations and used to provide several complexity results for the identification, existence, and evaluation problems for this type of approximation.
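To make the notion of completeness without soundness concrete, the following is a minimal sketch and not the article's algorithm. It evaluates a cyclic CQ (a triangle) and an acyclic CQ that contains it (a path of length two) by brute force over a small, made-up database. The function name `evaluate`, the atom encoding, and the example data are all assumptions chosen for illustration. The path query is only shown to be a containing acyclic query; no claim is made that it is the best overapproximation in the article's sense.

```python
from itertools import product

def evaluate(head, atoms, db):
    """Brute-force CQ evaluation (hypothetical helper, exponential in the
    number of variables): try every assignment of variables to domain values
    and keep the head tuples of assignments that satisfy all atoms."""
    variables = sorted({v for _, args in atoms for v in args})
    domain = sorted({c for tuples in db.values() for t in tuples for c in t})
    answers = set()
    for values in product(domain, repeat=len(variables)):
        h = dict(zip(variables, values))
        if all(tuple(h[v] for v in args) in db[rel] for rel, args in atoms):
            answers.add(tuple(h[v] for v in head))
    return answers

# Illustrative database: one triangle 1 -> 2 -> 3 -> 1 and one path 4 -> 5 -> 6.
db = {"E": {(1, 2), (2, 3), (3, 1), (4, 5), (5, 6)}}

# q(x)  :- E(x, y), E(y, z), E(z, x)   (cyclic)
triangle = [("E", ("x", "y")), ("E", ("y", "z")), ("E", ("z", "x"))]
# q'(x) :- E(x, y), E(y, z)            (acyclic, contains q)
path = [("E", ("x", "y")), ("E", ("y", "z"))]

exact = evaluate(("x",), triangle, db)
approx = evaluate(("x",), path, db)

print(sorted(exact))    # answers of the triangle query
print(sorted(approx))   # answers of the containing path query
print(exact <= approx)  # completeness: every exact answer is returned
```

On this database the path query returns every answer of the triangle query plus the node 4, which starts a path of length two but lies on no triangle. That extra answer is the kind of unsound result an overapproximation is allowed to produce.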
