Towards Empty Answers in SPARQL: Approximating Querying with RDF Embedding

The LOD cloud offers a plethora of RDF data sources in which users discover items of interest by issuing SPARQL queries. A common problem users face is the empty answer: given a SPARQL query that returns nothing, how can the query be refined to obtain a non-empty result set? In this paper, we propose a framework based on RDF graph embedding that addresses the SPARQL empty-answer problem in a continuous vector space. We first project the RDF graph into a continuous vector space using an entity-context-preserving translational embedding model designed specifically for SPARQL queries. Then, given a SPARQL query that returns an empty set, we partition it into several parts and compute approximate answers by leveraging the RDF embeddings and the translation mechanism. We also generate alternative queries for the returned answers, which help users clarify their expectations and ultimately refine the original query. To validate the effectiveness and efficiency of our framework, we conduct extensive experiments on a real-world RDF dataset. The results show that our framework can significantly improve the quality of approximate answers and speed up the generation of alternative queries.
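As a rough illustration of the translation mechanism mentioned above, the following minimal sketch assumes a plain TransE-style model in which a triple (head, relation, tail) is plausible when head + relation lies close to tail. The toy 2-D vectors, the entity and relation names, and the helper functions `translate` and `approximate_answers` are all hypothetical; the entity-context-preserving model and the query partitioning step of the framework are not reproduced here.

```python
import math

# Hypothetical toy embeddings (2-D); the values are made up for illustration
# and are not learned from any real RDF graph.
ENTITY_VECS = {
    "Berlin": (1.0, 0.0),
    "Germany": (1.0, 1.0),
    "Paris": (0.0, 0.2),
    "France": (0.0, 1.1),
}
RELATION_VECS = {"capitalOf": (0.0, 1.0)}


def translate(head, relation):
    """Return head + relation, the point where a matching tail is expected."""
    h = ENTITY_VECS[head]
    r = RELATION_VECS[relation]
    return tuple(a + b for a, b in zip(h, r))


def approximate_answers(head, relation, k=2):
    """Rank all other entities by Euclidean distance to head + relation."""
    target = translate(head, relation)
    candidates = [
        (math.dist(target, vec), name)
        for name, vec in ENTITY_VECS.items()
        if name != head
    ]
    candidates.sort()  # closest entities first
    return [name for _, name in candidates[:k]]


# Approximate answers for the triple pattern (Paris, capitalOf, ?x).
print(approximate_answers("Paris", "capitalOf"))
```

In this sketch the entity nearest to the translated point is returned even when no exact triple exists in the data, which is the sense in which embedding-based answers can be approximate rather than empty.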
