Hyper: A Framework for Peer-to-Peer Data Integration on Grids

Data Grids allow heterogeneous, distributed, and dynamic information resources to be viewed as if they were a uniform, stable, secure, and reliable database. In line with this view, current proposals for data integration on Grids are based on the notion of a global schema built over a collection of autonomous information sources. In dynamic and distributed environments, however, such a hierarchical and centralized architecture is not well suited to effective information integration. Peer-to-peer (P2P) data integration aims to overcome these drawbacks by modeling autonomous information systems as peers and establishing mappings among peers without resorting to any hierarchical structure. In this paper, we present Hyper, a joint research initiative of Università di Roma “La Sapienza” and IBM Italia, which aims at developing principles and techniques for peer-to-peer data integration on a Grid infrastructure. The main contributions presented are a semantic characterization of P2P data integration, the deployment of our P2P framework on a Grid architecture, and the design of a query answering algorithm that is coherent with both the semantics and the Grid infrastructure.
