Pattern-Based Schema Mapping and Query Answering in Peer-to-Peer XML Data Integration System

This chapter addresses the problem of data integration in a P2P environment, where each peer stores a schema of its local data, mappings between the schemas, and some schema constraints. The goal of the integration is to answer queries formulated against a chosen peer. The answer must consist of data stored in the queried peer as well as data from its direct and indirect partners. The chapter focuses on defining and using mappings and schema constraints, on query propagation across the P2P system, and on query answering in such a scenario. Schemas, mappings, constraints (functional dependencies), and queries are all expressed using a unified approach based on tree-pattern formulas. The chapter discusses how functional dependencies can be exploited to increase the information content of answers (by discovering missing values) and to control merging operations and propagation strategies. The chapter proposes algorithms for translating high-level specifications of mappings and queries into XQuery programs, and it shows how the discussed method has been implemented in the SixP2P (or 6P2P) system.
