Approximation algorithms for querying incomplete databases

Abstract. Certain answers are a widely accepted semantics for query answering over incomplete databases. Because computing them is coNP-hard, recent research has focused on polynomial-time evaluation algorithms with correctness guarantees, that is, techniques that compute a sound but possibly incomplete set of certain answers. The aim is to make the computation of certain answers feasible in practice by settling for under-approximations. In this paper, we present novel evaluation algorithms with correctness guarantees that provide better approximations than current techniques while retaining polynomial-time data complexity. The central tools of our approach are conditional tables and the conditional evaluation of queries. We propose different strategies for evaluating conditions, each leading to a different approximation algorithm: more accurate strategies have higher running times, but they return more certain answers. Our approach thus offers a suite of approximation algorithms from which users can choose the technique that best balances efficiency against the quality of the results.
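To make the notion of a sound under-approximation concrete, the following toy sketch compares SQL-style three-valued evaluation with certain answers computed by brute force. It is an illustration only and not one of the algorithms of the paper. The relation `R`, the query (select A where B = 1 or B <> 1), the `_n` naming convention for labeled nulls, and the finite `domain` over which nulls range are all assumptions made for this example.

```python
from itertools import product

NULL_PREFIX = "_n"  # assumed convention: strings starting with "_n" are labeled nulls


def is_null(v):
    return isinstance(v, str) and v.startswith(NULL_PREFIX)


def tv_eq(x, y):
    """SQL-style equality: None (unknown) if either side is a null."""
    if is_null(x) or is_null(y):
        return None
    return x == y


def tv_not(a):
    return None if a is None else (not a)


def tv_or(a, b):
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def sql_style(relation):
    """Keep a tuple only if (B = 1 OR B <> 1) evaluates to True, not unknown."""
    return {a for (a, b) in relation
            if tv_or(tv_eq(b, 1), tv_not(tv_eq(b, 1))) is True}


def certain_by_enumeration(relation, domain):
    """Intersect the query answers over every valuation of the nulls.

    Exponential in the number of nulls; shown only as a reference point.
    """
    nulls = sorted({v for t in relation for v in t if is_null(v)})
    result = None
    for values in product(domain, repeat=len(nulls)):
        valuation = dict(zip(nulls, values))
        complete = [tuple(valuation.get(v, v) for v in t) for t in relation]
        answers = {a for (a, b) in complete if b == 1 or b != 1}
        result = answers if result is None else result & answers
    return result


R = [("a", "_n1"), ("b", 1), ("c", 2)]
approx = sql_style(R)                            # {"b", "c"}
exact = certain_by_enumeration(R, [1, 2, 3])     # {"a", "b", "c"}
```

Here the three-valued evaluation is sound (every answer it returns is certain) but misses `"a"`, because it evaluates the condition on the null to unknown. Keeping the condition symbolic and recognising that it holds under every valuation would recover that answer, which is the kind of gain that conditional evaluation is meant to provide.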
