Optimising Coverage, Freshness and Diversity in Live Exploration-based Linked Data Queries

Approaches to executing queries over distributed Linked Open Data that rely on centralised indexes or distributed query federation are currently limited in their ability to provide complete coverage and up-to-date results. Live exploration-based query execution, by contrast, follows the Linked Open Data publishing principles and dereferences Internationalised Resource Identifiers (IRIs) on the fly, so that results can be drawn from Linked Data anywhere on the Web. We propose and investigate similarity search-based strategies for selecting which IRIs to dereference during live exploration-based querying, with the aim of maximising the user criteria of coverage, freshness and diversity within a limited execution time. Existing approaches may provide complete results, but with response times that are too high to be useful in many practical applications. Results from a set of sample queries, comparing the IRI selection strategies with existing approaches, show that coverage, freshness and diversity can be improved by up to 30%.
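
To make the idea of ranking IRIs under a time limit concrete, the following is a minimal sketch and not the method evaluated in this work. It assumes a hypothetical `fetch_links` callback that stands in for dereferencing an IRI and returning the IRIs found in the retrieved data, and it uses Python's `difflib` string ratio as a stand-in similarity measure. The names `similarity` and `explore` are illustrative only.

```python
import heapq
import time
from difflib import SequenceMatcher


def similarity(iri, query_terms):
    """Best string similarity (0..1) between an IRI and any query term.

    Assumption: a plain difflib ratio stands in for whatever similarity
    measure a real strategy would use. query_terms must be non-empty.
    """
    lowered = iri.lower()
    return max(SequenceMatcher(None, lowered, term.lower()).ratio()
               for term in query_terms)


def explore(seeds, query_terms, fetch_links, budget_seconds,
            clock=time.monotonic):
    """Visit IRIs in order of decreasing similarity until time runs out.

    fetch_links(iri) is a hypothetical callback that dereferences the IRI
    and returns an iterable of IRIs discovered in the response.
    """
    deadline = clock() + budget_seconds
    # heapq is a min-heap, so scores are negated to pop the best IRI first.
    frontier = [(-similarity(seed, query_terms), seed) for seed in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    visited = []
    while frontier and clock() < deadline:
        _, iri = heapq.heappop(frontier)
        visited.append(iri)
        for linked in fetch_links(iri):
            if linked not in seen:
                seen.add(linked)
                score = similarity(linked, query_terms)
                heapq.heappush(frontier, (-score, linked))
    return visited
```

In this sketch the time budget is checked only between dereferencing steps, so a single slow lookup can overrun it; a practical implementation would also need per-request timeouts. Criteria such as freshness or diversity would require scoring functions beyond plain string similarity.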
