Data exchange beyond complete data

In the traditional data exchange setting, source instances are restricted to be complete, in the sense that every fact is either true or false in these instances. Although natural for a typical database translation scenario, this restriction is gradually becoming an impediment to the development of a wide range of applications that need to exchange objects admitting several interpretations. In particular, we are motivated by two applications that go beyond the usual data exchange scenario: exchanging incomplete information and exchanging knowledge bases. In this paper, we propose a general framework for data exchange that can deal with both. More specifically, we address the problem of exchanging information given by representation systems, which are essentially finite descriptions of (possibly infinite) sets of complete instances. We use the classical semantics of mappings specified by sets of logical sentences to give a meaningful semantics to the notion of exchanging representatives, from which the standard notions of solution, space of solutions, and universal solution naturally arise. We also introduce the notion of a strong representation system for a class of mappings, which resembles the concept of a strong representation system for a query language. We show the robustness of our proposal by applying it to the two applications mentioned above, both of which are instantiations of the exchange problem for representation systems. We study these two applications in detail, presenting results on expressiveness, query answering, and the complexity of computing solutions, as well as algorithms to materialize solutions.
