Web-Scale Querying through Linked Data Fragments

To unlock the full potential of Linked Data sources, we need flexible ways to query them. Public SPARQL endpoints aim to fulfill that need, but their availability is notoriously problematic. We therefore introduce Linked Data Fragments, a publishing method that allows efficient offloading of query execution from servers to clients through a lightweight partitioning strategy. It enables servers to maintain availability rates as high as any regular HTTP server, allowing querying to scale reliably to much larger numbers of clients. This paper explains the core concepts behind Linked Data Fragments and experimentally verifies their Web-level scalability, at the cost of increased query times. We show how trading server-side query execution for inexpensive data resources with relevant affordances enables a new generation of intelligent clients.
