A Hybrid Approach to Linked Data Query Processing with Time Constraints

In addition to RDF data within documents published according to the Linked Data principles, SPARQL endpoints are a potential source of a great deal of Linked Data. Query execution in languages such as SPARQL can utilise both of these types of data source. In this paper we present a hybrid approach to answering SPARQL queries that uses both link traversal-based and distributed query processing-based techniques, combining query answering over the Web of Linked Data and over SPARQL endpoints respectively. The technique differs from existing work in that link traversal and endpoint queries take place in parallel without a static query plan. We demonstrate how, using a set of heuristics and optimisation techniques, this can be effective when answering queries with time constraints (where incomplete answers are acceptable in order to minimise execution time). An evaluation of the technique is presented using the FedBench Linked Data queries with query execution time limited to 10 seconds, with an analysis of the answers that can be provided within this time limit.
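
The idea of running link traversal and endpoint queries in parallel under a time limit can be illustrated with a minimal sketch. This is not the implementation evaluated in the paper: the names collect_answers, traverse_links and query_endpoint are hypothetical, the two sources are stand-in generators rather than real Linked Data or SPARQL clients, and the heuristics and optimisation techniques are omitted. The sketch only shows answers from both kinds of source being merged as they arrive, with the result returned (possibly incomplete) once the time limit expires.

```python
import queue
import threading
import time


def traverse_links():
    """Hypothetical stand-in for link traversal: yields solutions as found."""
    yield ("alice",)


def query_endpoint():
    """Hypothetical stand-in for a SPARQL endpoint that answers slowly."""
    yield ("bob",)
    time.sleep(1.5)  # simulate a slow remote endpoint
    yield ("carol",)


def collect_answers(sources, time_limit):
    """Run all sources in parallel and return the set of answers that
    arrived within time_limit seconds. The result may be incomplete."""
    results = queue.Queue()
    stop = threading.Event()

    def worker(source):
        for answer in source():
            if stop.is_set():  # deadline passed, discard further answers
                return
            results.put(answer)

    threads = [
        threading.Thread(target=worker, args=(s,), daemon=True)
        for s in sources
    ]
    for t in threads:
        t.start()

    deadline = time.monotonic() + time_limit
    answers = set()
    while time.monotonic() < deadline:
        try:
            answers.add(results.get(timeout=0.01))
        except queue.Empty:
            # Nothing queued right now; stop early if every source is done.
            if not any(t.is_alive() for t in threads):
                break
    stop.set()

    # Pick up anything that was queued but not yet read.
    while True:
        try:
            answers.add(results.get_nowait())
        except queue.Empty:
            break
    return answers


if __name__ == "__main__":
    print(collect_answers([traverse_links, query_endpoint], time_limit=0.3))
```

Neither source is ordered ahead of the other here: each pushes solutions into a shared queue, which loosely mirrors the absence of a static query plan. The deadline is enforced by the collector rather than by the sources themselves.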