Optimizing and implementing repair programs for consistent query answering in databases

Databases do not always satisfy their integrity constraints (ICs), and this can happen for a number of reasons. In most cases, however, an important part of the data is still consistent with the ICs and can still be retrieved through queries posed to the database. Consistent query answers are characterized as the ordinary answers obtained from every minimally repaired, consistent version of the database. Database repairs with respect to a wide class of ICs can be specified as the stable models of disjunctive logic programs. Consistent Query Answering (CQA) for first-order queries is thus translated into cautious reasoning under the stable models semantics. Using logic programs does not exceed the intrinsic complexity of CQA; using them in a straightforward manner, however, is usually inefficient. The goal of this thesis is to develop optimized techniques for evaluating queries over inconsistent databases using logic programs. More specifically, we optimize the structure of the programs, the computation of their models, and the evaluation of queries over them. We develop a system that implements the optimized logic programs and efficient methods for computing consistent answers to first-order queries. Moreover, we propose the well-founded semantics (WFS) as an alternative way to obtain consistent answers. We show that, for a certain class of queries and ICs, the well-founded interpretation of a program retrieves the same consistent answers as the stable models semantics. The WFS has lower data complexity than the stable models semantics. We also extend the use of logic programs to retrieve consistent answers to aggregate queries, and we develop a repair semantics for multidimensional databases.
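To make the definition of a consistent answer concrete, the following minimal sketch enumerates the repairs of a single relation that violates a key constraint and keeps only the answers returned in every repair. It is a brute-force illustration, not the logic-program encoding or the optimizations developed in the thesis. The relation `employee`, the functions `repairs` and `consistent_answers`, and the two sample queries are all hypothetical names introduced here. The sketch assumes that repairs are obtained by tuple deletion only, so that each repair keeps exactly one tuple per key value.

```python
from itertools import product


def repairs(relation, key_index=0):
    """Yield every repair of `relation` under a key constraint on one column.

    Assumption: repairs are maximal consistent subsets, so each repair
    keeps exactly one tuple from every group of tuples sharing a key value.
    """
    groups = {}
    for row in relation:
        groups.setdefault(row[key_index], []).append(row)
    # One choice per key group; the Cartesian product gives all repairs.
    for choice in product(*groups.values()):
        yield set(choice)


def consistent_answers(relation, query, key_index=0):
    """Return the answers that `query` produces in every repair."""
    answer_sets = [set(query(repair)) for repair in repairs(relation, key_index)]
    return set.intersection(*answer_sets)


# Hypothetical instance: `name` is the key, and "ana" violates it.
employee = {("ana", 5000), ("ana", 7000), ("bob", 4000)}


def all_rows(db):
    """Query returning every (name, salary) tuple."""
    return db


def names(db):
    """Query projecting the name column."""
    return {name for name, _ in db}


print(consistent_answers(employee, all_rows))
print(consistent_answers(employee, names))
```

Under these assumptions the instance has two repairs, one for each salary recorded for "ana". Only the tuple for "bob" appears in both, so it is the only consistent answer to the full-tuple query, while both names are consistent answers to the projection. The number of repairs can grow exponentially with the number of conflicting key groups, which is why enumerating them as above does not scale.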
