SPARQL with property paths on the Web

Linked Data on the Web represents an immense source of knowledge suitable to be automatically processed and queried. In this respect, there are different approaches for Linked Data querying that differ in the degree of centralization adopted. On the one hand, the SPARQL query language, originally defined for querying single datasets, has been enhanced with features to query federations of datasets; however, this attempt is not sufficient to cope with the distributed nature of data sources available as Linked Data. On the other hand, extensions or variations of SPARQL aim to find trade-offs between centralized and fully distributed querying. The idea is to partially move the computational load from the servers to the clients. Despite the variety and the relative merits of these approaches, as of today, there is no standard language for querying Linked Data on the Web. A specific requirement for such a language to capture the distributed, graph-like nature of Linked Data sources on the Web is support for graph navigation. Recently, SPARQL has been extended with a navigational feature called property paths (PPs). However, the semantics of SPARQL restricts the scope of navigation via PPs to single RDF graphs. This restriction limits the applicability of PPs for querying distributed Linked Data sources on the Web. To fill this gap, in this paper we provide formal foundations for evaluating PPs on the Web, thus contributing to the definition of a query language for Linked Data. We first introduce a family of reachability-based query semantics for PPs that distinguish between navigation on the Web and navigation at the data level. Thereafter, we consider an alternative query semantics that couples Web graph navigation and data-level navigation; we call it context-based semantics. Given these semantics, we find that for some PP-based SPARQL queries a complete evaluation on the Web is not possible.
To study this phenomenon we introduce a notion of Web-safeness of queries, and prove a decidable syntactic property that enables systems to identify queries that are Web-safe. In addition to establishing these formal foundations, we conducted an experimental comparison of the context-based semantics and a reachability-based semantics. Our experiments show that evaluating a PP-based query under the context-based semantics requires significantly fewer dereferencing operations, but the computed query result may contain fewer solutions.
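
The trade-off described above can be illustrated with a small Python sketch. It is only a toy model under stated assumptions, not the formal semantics defined in the paper: the "Web" is a hypothetical dictionary `TOY_WEB` mapping IRIs to the triples returned when the IRI is dereferenced, and `eval_star_path` is an illustrative helper that evaluates a path of the form `start predicate* ?x` while following only triples whose subject is the IRI that was just dereferenced. All names and data are invented for illustration.

```python
from collections import deque

# Hypothetical toy Web (an assumption for illustration): each IRI
# dereferences to a set of (subject, predicate, object) triples.
TOY_WEB = {
    "ex:a": {("ex:a", "ex:knows", "ex:b"), ("ex:c", "ex:knows", "ex:d")},
    "ex:b": {("ex:b", "ex:knows", "ex:c")},
    "ex:c": set(),
}


def dereference(iri, web, log):
    """Simulate an IRI lookup and record it so lookups can be counted."""
    log.append(iri)
    return web.get(iri, set())


def eval_star_path(start, predicate, web):
    """Evaluate `start predicate* ?x` in a context-style manner.

    Only triples whose subject is the IRI just dereferenced are followed,
    so navigation on the Web and navigation in the data are coupled.
    Returns the set of reached nodes and the list of dereferenced IRIs.
    """
    log = []
    reached = {start}          # zero-length path: the start node itself
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in dereference(node, web, log):
            if s == node and p == predicate and o not in reached:
                reached.add(o)
                queue.append(o)
    return reached, log


reached, log = eval_star_path("ex:a", "ex:knows", TOY_WEB)
print(sorted(reached))  # nodes found by the path
print(len(log))         # number of dereferencing operations
```

In this toy data, the triple linking `ex:c` to `ex:d` is published only in the document for `ex:a`, so the context-style traversal never follows it and `ex:d` is not part of the result. An evaluation that first merges all reachable documents into one graph would find `ex:d`, at the price of having to retrieve and combine more data.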
