On reconciling data exchange, data integration, and peer data management

Data exchange and virtual data integration have been the subject of several investigations in the recent literature. At the same time, the notion of peer data management has emerged as a powerful abstraction of many forms of flexible and dynamic data-centered distributed systems. Although research on these issues has progressed considerably in recent years, a clear understanding of how to combine data exchange and data integration in peer data management is still missing. This is the subject of the present paper. We start our investigation by proposing a novel framework for peer data exchange, showing that it generalizes the classical data exchange setting. We also present algorithms for all the relevant data exchange tasks, and show that they can all be performed in polynomial time with respect to data complexity. Motivated by the observation that typical mappings and integrity constraints found in data integration are not captured by peer data exchange, we extend the framework to incorporate these features. One of the main difficulties is that the constraints of this new class are not amenable to materialization. We address this issue by resorting to a suitable combination of virtual and materialized data exchange, showing that the resulting framework generalizes both classical data exchange and classical data integration, and that the new setting incorporates the most expressive types of mappings and constraints considered in the two contexts. Finally, we present algorithms for all the relevant data management tasks in the new setting as well, and show that, again, their data complexity is polynomial.
