Partitioning RDF exploiting workload information

One approach to leverage scalable systems for RDF management is partitioning large datasets across distributed servers. In this paper we consider workload data, given in the form of query patterns and their frequencies, for determining how to partition RDF datasets. Our experimental study shows that our workload-aware method is an effective way to cluster related data and provides better query response times compared to an elementary fragmentation method.
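
To make the idea of workload-driven partitioning concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes that each query pattern is reduced to the set of predicates it touches, that fragmentation happens at the predicate level, and that groups are merged greedily by co-occurrence frequency. The function names (`predicate_affinity`, `partition_predicates`, `assign_triples`) and the sample workload are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def predicate_affinity(workload):
    """Sum query frequencies over every pair of predicates that co-occur in a pattern."""
    affinity = defaultdict(int)
    for predicates, frequency in workload:
        for a, b in combinations(sorted(set(predicates)), 2):
            affinity[(a, b)] += frequency
    return dict(affinity)


def partition_predicates(workload, num_fragments):
    """Greedily merge the highest-affinity predicate groups until num_fragments remain.

    Assumption: a simple greedy heuristic chosen for illustration only.
    """
    predicates = sorted({p for preds, _ in workload for p in preds})
    parent = {p: p for p in predicates}  # union-find parent pointers

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    groups = len(predicates)
    ranked = sorted(predicate_affinity(workload).items(),
                    key=lambda kv: (-kv[1], kv[0]))
    for (a, b), _ in ranked:
        if groups <= num_fragments:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
            groups -= 1

    fragments = defaultdict(set)
    for p in predicates:
        fragments[find(p)].add(p)
    return sorted(sorted(f) for f in fragments.values())


def assign_triples(triples, fragments):
    """Place each (subject, predicate, object) triple in its predicate's fragment.

    Assumption: predicates never seen in the workload fall back to fragment 0.
    """
    fragment_of = {p: i for i, frag in enumerate(fragments) for p in frag}
    placed = defaultdict(list)
    for s, p, o in triples:
        placed[fragment_of.get(p, 0)].append((s, p, o))
    return dict(placed)


# Hypothetical workload: (predicates used by a query pattern, frequency).
workload = [
    ({"name", "price"}, 50),
    ({"price", "vendor"}, 5),
    ({"review", "rating"}, 30),
]
fragments = partition_predicates(workload, num_fragments=3)
triples = [("p1", "name", "A"), ("p1", "price", "10"), ("r1", "rating", "5")]
placement = assign_triples(triples, fragments)
```

In this sketch, predicates that frequent queries request together end up in the same fragment, so those queries can be answered from a single server, whereas a workload-agnostic split may scatter them.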