Algorithms for Computing Approximate Certain Answers over Incomplete Databases

Incomplete information arises in many database applications, including data integration, data exchange, inconsistency management, data cleaning, and ontological reasoning. A principled way of answering queries over incomplete databases is to compute certain answers, which are the query answers that can be obtained from every complete database represented by an incomplete one. For databases containing (labeled) nulls, certain answers to positive queries can be computed in polynomial time, but for more general queries with negation the problem becomes coNP-hard. To make query answering feasible in practice, one might resort to SQL's evaluation, but the way SQL behaves in the presence of nulls may produce wrong answers. Thus, SQL's evaluation is efficient but flawed, whereas certain answers are a principled semantics but have high computational complexity. To deal with this issue, recent research has focused on developing polynomial-time approximation algorithms for computing (approximate) certain answers. This paper surveys recent advances in this area.
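The gap between SQL's evaluation and certain answers can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the relations R and S, the difference query written with NOT EXISTS, the helper names `sql_answers` and `certain_answers`, and the finite domain used to enumerate valuations of nulls (here each `None` is treated as an independent null). The brute-force enumeration is exponential in the number of nulls and only serves to show the definition, not a practical algorithm.

```python
import sqlite3
from itertools import product

# Hypothetical query: values in R that do not occur in S.
QUERY = "SELECT R.a FROM R WHERE NOT EXISTS (SELECT * FROM S WHERE S.a = R.a)"

def sql_answers(r_rows, s_rows):
    """Evaluate the query with SQL's three-valued logic (None is stored as NULL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE R (a INTEGER)")
    conn.execute("CREATE TABLE S (a INTEGER)")
    conn.executemany("INSERT INTO R VALUES (?)", [(v,) for v in r_rows])
    conn.executemany("INSERT INTO S VALUES (?)", [(v,) for v in s_rows])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return {row[0] for row in rows}

def certain_answers(r_rows, s_rows, domain):
    """Intersect the answers over every way of replacing nulls by domain values."""
    rows = list(r_rows) + list(s_rows)
    null_positions = [i for i, v in enumerate(rows) if v is None]
    result = None
    for values in product(domain, repeat=len(null_positions)):
        complete = list(rows)
        for pos, val in zip(null_positions, values):
            complete[pos] = val
        r = complete[:len(r_rows)]
        s = complete[len(r_rows):]
        answers = set(r) - set(s)  # the same difference query on a complete database
        result = answers if result is None else result & answers
    return result

R = [1]
S = [None]  # one unknown value
print(sql_answers(R, S))              # SQL keeps 1: the comparison with NULL is unknown
print(certain_answers(R, S, [1, 2]))  # empty: the null may stand for 1
```

In this instance SQL returns 1 even though 1 is not a certain answer, since the null in S may be interpreted as 1. This is the kind of wrong answer the abstract refers to.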
