SMART-KG: Hybrid Shipping for SPARQL Querying on the Web

While Linked Data (LD) provides standards for publishing (RDF) and querying (SPARQL) Knowledge Graphs (KGs) on the Web, serving, accessing and processing such open, decentralized KGs is often practically impossible, as query timeouts on publicly available SPARQL endpoints show. Alternative solutions such as Triple Pattern Fragments (TPF) attempt to tackle the availability problem by pushing the query processing workload to the client side, but they transfer large amounts of irrelevant data on complex queries with large intermediate results. In this paper, we present smart-KG, a novel approach that shares the load between servers and clients while significantly reducing data transfer volume, by combining TPF with shipping compressed KG partitions. Our evaluations show that smart-KG outperforms state-of-the-art client-side solutions and increases server-side availability, a step towards more cost-effective and balanced hosting of open and decentralized KGs.
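
To make the idea of combining TPF requests with partition shipping more concrete, the following is a minimal client-side sketch. It is not taken from the paper: the grouping rule (patterns sharing a subject are answered from a shipped partition, isolated patterns through TPF), the names `TriplePattern` and `plan_requests`, and the example query are all assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class TriplePattern(NamedTuple):
    subject: str
    predicate: str
    object: str


def plan_requests(patterns):
    """Assign a hypothetical shipping strategy to each group of patterns.

    Assumption (not from the source): patterns that share a subject are
    evaluated locally on a shipped compressed partition, while a pattern
    that stands alone is sent to the server as a TPF request.
    """
    by_subject = defaultdict(list)
    for tp in patterns:
        by_subject[tp.subject].append(tp)

    plan = []
    for group in by_subject.values():
        strategy = "partition" if len(group) >= 2 else "tpf"
        plan.append((strategy, group))
    return plan


# Illustrative query: two patterns on ?film, one on ?director.
query = [
    TriplePattern("?film", "dbo:director", "?director"),
    TriplePattern("?film", "dbo:starring", "?actor"),
    TriplePattern("?director", "dbo:birthPlace", "?city"),
]

plan = plan_requests(query)
for strategy, group in plan:
    print(strategy, [tp.predicate for tp in group])
```

In this toy plan, the two patterns on `?film` would be joined on the client against a downloaded partition, so their intermediate results never cross the network, while the single pattern on `?director` stays a lightweight server request.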
