Modular P2P-Based Approach for RDF Data Storage and Retrieval

The Resource Description Framework (RDF) is one of the key elements of the Semantic Web. Efficient storage and retrieval of RDF data in large-scale settings remains challenging, and existing solutions are monolithic and thus not very flexible from a software engineering point of view. In this paper, we propose a modular system, based on the scalable Content-Addressable Network (CAN), that stores and retrieves RDF data in large-scale settings. In our architecture, we identify and isolate the key components that form such a system. We evaluated our system on the Grid'5000 testbed over 300 peers on 75 machines, and the outcome of these micro-benchmarks shows interesting results in terms of scalability and concurrent queries.
