Handling inconsistency in databases and data integration systems

A database may fail to satisfy certain integrity constraints (ICs) for several reasons, for example, when it is the result of integrating several independent data sources. However, most of the information in it is likely still consistent with the ICs and could be retrieved when queries are answered. Consistent answers with respect to a set of ICs have been characterized as the answers that can be obtained from every possible minimal repair of the database. The goal of this research is to develop methods for retrieving consistent answers, for a wide and practical class of constraints and queries, from relational databases and from data integration systems. We pay special attention to databases with null values, and we give a semantics of constraint satisfaction in the presence of nulls that generalizes the one used in commercial DBMSs. Since there are interesting connections between consistent querying of virtual data integration systems and other areas, such as querying incomplete databases, merging inconsistent theories, semantic reconciliation of data, schema mapping, data exchange, and query answering in peer data management systems, the results of this research could also be applied to them. In our research, we explore in more depth the connection with virtual data integration systems and peer data management systems.
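
The characterization of consistent answers as those obtained from every minimal repair can be illustrated with a small sketch. It is not the method developed in this research: it assumes a single key constraint (the first attribute determines the rest), restricts repairs to tuple deletions (so repairs are the maximal consistent subsets), and enumerates them by brute force, which is only feasible for toy instances. The relation `employee` and the helper names `violates_key`, `repairs`, and `consistent_answers` are hypothetical.

```python
from itertools import combinations


def violates_key(tuples):
    """True if two distinct tuples agree on the key (first attribute)."""
    return any(t[0] == u[0] and t != u for t, u in combinations(tuples, 2))


def repairs(db):
    """Tuple-deletion repairs: maximal subsets of db that satisfy the key.

    Brute force over all subsets; an assumption made for illustration only.
    """
    db = frozenset(db)
    consistent = [
        frozenset(subset)
        for size in range(len(db) + 1)
        for subset in combinations(db, size)
        if not violates_key(subset)
    ]
    # Keep only subsets that are not strictly contained in another consistent one.
    return [s for s in consistent if not any(s < t for t in consistent)]


def consistent_answers(db, query):
    """Answers returned by the query in every repair (intersection)."""
    answer_sets = [set(query(r)) for r in repairs(db)]
    return set.intersection(*answer_sets)


# Hypothetical instance: 'page' has two conflicting salaries.
employee = {("page", 5000), ("page", 8000), ("smith", 3000)}

# Full tuples: only the tuple untouched by the conflict survives in all repairs.
print(sorted(consistent_answers(employee, lambda r: r)))
# → [('smith', 3000)]

# Projection on name: 'page' is still a consistent answer, since some
# tuple for 'page' is present in every repair.
print(sorted(consistent_answers(employee, lambda r: {t[0] for t in r})))
# → ['page', 'smith']
```

The projection query shows why consistent answering is more informative than simply discarding every conflicting tuple: the existence of the employee `page` is retained even though the salary is uncertain.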
