GrouPeer: A System for Clustering PDMSs

Sharing structured data in a peer data management system (PDMS) is hard because of schema heterogeneity and peer autonomy. To overcome heterogeneity, peer databases employ mappings that partially match local information to that of their direct neighbors. Traditionally, a query is rewritten successively at each peer along its propagation path. The result is gradual query degradation: peers cannot return data pertinent to the original query, even when they store such data. This demonstration presents GrouPeer, a system that overcomes the query degradation problem and, using only normal query traffic, dynamically clusters the overlay according to the semantics of the peer data. Peers are provided with a methodology that allows them to choose which rewritten version of a query to answer and to discover remote information-rich sources. The demonstration illustrates the components of GrouPeer's clustering mechanism: approximate query rewriting, a query similarity methodology, construction of new mappings, an iterative learning process, and automatic schema matching. It also shows that the system can perform gradual semantic clustering and return high-quality answers to peer queries.
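The degradation described above can be illustrated with a minimal sketch. It is not GrouPeer's implementation: it assumes, purely for illustration, that a query is a set of attribute names and that a mapping is a partial dictionary from a peer's attributes to its neighbor's. The names `rewrite`, `propagate`, and `preserved_fraction` are hypothetical, and the fraction of surviving attributes is only a crude stand-in for the query similarity methodology mentioned in the abstract.

```python
from typing import Dict, FrozenSet, List

# Assumption: a query is modeled as a set of attribute names, and a
# mapping as a partial attribute-to-attribute dictionary between two peers.
Query = FrozenSet[str]
Mapping = Dict[str, str]


def rewrite(query: Query, mapping: Mapping) -> Query:
    """Rewrite a query for the next peer; unmapped attributes are lost."""
    return frozenset(mapping[attr] for attr in query if attr in mapping)


def propagate(query: Query, path: List[Mapping]) -> List[Query]:
    """Return every version of the query along the path, original first."""
    versions = [query]
    for mapping in path:
        query = rewrite(query, mapping)
        versions.append(query)
    return versions


def preserved_fraction(versions: List[Query]) -> List[float]:
    """Fraction of the original query's attributes surviving at each hop."""
    original_size = len(versions[0])
    return [len(version) / original_size for version in versions]


# Hypothetical three-peer path: each mapping covers only part of the schema.
original = frozenset({"title", "author", "year"})
path = [
    {"title": "name", "author": "writer"},  # peer 1 -> peer 2, "year" unmapped
    {"name": "label"},                      # peer 2 -> peer 3, "writer" unmapped
]
versions = propagate(original, path)
```

In this toy path the query shrinks from three attributes to two and then to one, so the last peer sees only a fragment of what was asked. A peer that can compare each rewritten version against the original, rather than only against the version it received, has a basis for deciding which version is worth answering.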