Reliable SPARQL queries with consistent results over P2P-shared RDF storage

One aim of the semantic web is to build large knowledge bases distributed over the internet. Knowledge management systems that gather, merge, and make available information physically stored in multiple locations suffer from consistency and data fragmentation issues caused by node failures. In this paper we address these problems and present an architecture for managing reliable SPARQL queries with consistent results over a P2P-shared RDF storage. The RDF storage is composed of peer nodes organized in a ring topology based on a Distributed Hash Table (DHT). Each node provides an entry point that enables clients outside the network to query the knowledge base using atomic, disjunctive, and conjunctive SPARQL queries. The consistency of the results is increased by a data redundancy algorithm that replicates each RDF triple on multiple nodes so that, in the case of peer failure, other peers can retrieve the data needed to solve the queries. Additionally, a load distribution algorithm maintains a uniform distribution of the data among the participating peers by dynamically changing the key space assigned to each node in the DHT. The performance of this approach is evaluated by monitoring the effectiveness of the load balancing and redundancy algorithms and the overhead they introduce on the network load in both a static scenario (only join events) and a dynamic scenario.
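The replication idea described above can be illustrated with a minimal sketch. It is not the architecture evaluated in the paper: the class name `Ring`, the helper `ring_key`, the 32-bit key space, the choice to index each triple under its subject, predicate, and object, and the successor-based replica placement are all assumptions made for illustration. Load balancing by key-space reassignment is not modelled.

```python
import hashlib
from bisect import bisect_left


def ring_key(value: str) -> int:
    """Map a string to a position on an assumed 32-bit identifier ring."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Ring:
    """Hypothetical in-memory model of a DHT ring with triple replication."""

    def __init__(self, node_names, replicas=2):
        self.replicas = replicas
        # Nodes sorted by their position on the ring.
        self.nodes = sorted((ring_key(n), n) for n in node_names)
        self.store = {n: set() for n in node_names}
        self.alive = set(node_names)

    def successors(self, key):
        """Return the node responsible for key, followed by its replica holders."""
        ids = [k for k, _ in self.nodes]
        start = bisect_left(ids, key) % len(self.nodes)
        count = min(self.replicas, len(self.nodes))
        return [self.nodes[(start + i) % len(self.nodes)][1] for i in range(count)]

    def insert(self, triple):
        """Store the triple under each of its terms, on every replica node."""
        for term in triple:
            for node in self.successors(ring_key(term)):
                self.store[node].add(triple)

    def fail(self, name):
        """Simulate a peer failure."""
        self.alive.discard(name)

    def query(self, term):
        """Atomic query: triples containing term, read from the first live replica."""
        for node in self.successors(ring_key(term)):
            if node in self.alive:
                return {t for t in self.store[node] if term in t}
        raise LookupError("all replicas for this key have failed")
```

In this sketch, a query for a term keeps returning the same triples after the responsible node fails, as long as at least one of its replica holders is still alive.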
