SaGe: Web Preemption for Public SPARQL Query Services

To provide stable and responsive public SPARQL query services, data providers enforce quotas on server usage. Queries that exceed these quotas are interrupted and deliver only partial results. Such interruption is not an issue if query execution can be resumed afterward. Unfortunately, there is no preemption model for the Web that allows SPARQL queries to be suspended and resumed. In this paper, we propose SaGe: a SPARQL query engine based on Web preemption. SaGe allows SPARQL queries to be suspended by the Web server after a fixed time quantum and resumed upon client request. Web preemption is tractable only if its cost in time is negligible compared to the time quantum. The challenge is to support the full SPARQL query language while keeping the cost of preemption negligible. Experimental results demonstrate that SaGe outperforms existing SPARQL query processing approaches by several orders of magnitude in terms of the average total query execution time and the time for first results.
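The suspend-and-resume cycle described above can be illustrated with a minimal sketch. It is not SaGe's implementation: the "query" is reduced to a scan over a list, the saved state is just an offset, and the names `run_quantum` and `client_execute` are hypothetical. The sketch only assumes that the server stops after a fixed time quantum and returns enough state for the client to ask for the next slice.

```python
import time
from typing import Callable, List, Optional, Tuple


def run_quantum(
    data: List[int],
    offset: int,
    quantum: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[int], Optional[int]]:
    """Server side (hypothetical): scan `data` from `offset` for one quantum.

    Returns the results produced in this slice and the saved state
    (the next offset), or None as state when the scan is complete.
    """
    deadline = clock() + quantum
    results: List[int] = []
    i = offset
    while i < len(data):
        # Always produce at least one result so every slice makes progress.
        results.append(data[i])
        i += 1
        # Suspend once the quantum is used up and work remains.
        if i < len(data) and clock() >= deadline:
            return results, i
    return results, None


def client_execute(
    data: List[int],
    quantum: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[int], int]:
    """Client side (hypothetical): resume the query until it completes.

    Returns all results and the number of requests that were needed.
    """
    results: List[int] = []
    state: Optional[int] = 0
    requests = 0
    while state is not None:
        page, state = run_quantum(data, state, quantum, clock)
        results.extend(page)
        requests += 1
    return results, requests
```

In this toy version, saving and restoring the state costs almost nothing because the state is a single integer. The point made in the abstract is that the same property has to hold for full SPARQL queries for preemption to remain negligible relative to the quantum.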
