Optimizing Reformulation-based Query Answering in RDF

Reformulation-based query answering is a query processing technique for answering queries under constraints. It consists of reformulating the query based on the constraints, so that evaluating the reformulated query directly against the data (i.e., without further considering the constraints) produces the correct answer set. In this paper, we consider optimizing reformulation-based query answering in the setting of ontology-based data access, where SPARQL conjunctive queries are posed against RDF facts on which constraints expressed by an RDF Schema hold. The literature provides query reformulation algorithms for many fragments of RDF. However, reformulated queries may be complex and thus may not be efficiently processed by a query engine; well-established query engines even fail to process them in some cases. Our contribution is (i) a generalization of prior query reformulation languages, which leads us to investigate a space of reformulated queries we call JUCQs (joins of unions of conjunctive queries) instead of a single reformulation; and (ii) an effective and efficient cost-based algorithm for selecting, from this space, the reformulated query with the lowest estimated cost. Our experiments show that our technique enables reformulation-based query answering where state-of-the-art approaches are simply infeasible, and that it may decrease the cost of query answering by orders of magnitude in other cases.
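To make the idea concrete, the following is a minimal Python sketch of reformulation under subclass constraints only. It is an illustration under stated assumptions, not the algorithm of the paper: the schema, the facts, the query, and all function names are hypothetical, only one kind of RDF Schema constraint is handled, and no cost model is involved. It contrasts a single union of conjunctive queries (UCQ) with one possible JUCQ, in which each query atom is reformulated into its own union and the unions are then joined.

```python
from itertools import product

# Hypothetical toy schema: (subclass, superclass) constraints.
SUBCLASS = {("GradStudent", "Student"), ("Student", "Person")}

# Hypothetical explicit facts, as (subject, property, object) triples.
FACTS = {
    ("alice", "type", "GradStudent"),
    ("bob", "type", "Person"),
    ("alice", "takes", "db101"),
    ("bob", "takes", "db101"),
}

def subclasses(cls):
    """Return cls together with all of its transitive subclasses."""
    result, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for child, parent in SUBCLASS:
            if parent == current and child not in result:
                result.add(child)
                frontier.append(child)
    return result

def reformulate_atom(atom):
    """Reformulate one triple pattern into a union of triple patterns."""
    s, p, o = atom
    if p == "type" and not o.startswith("?"):
        return [(s, p, c) for c in sorted(subclasses(o))]
    return [atom]

def match(atom, fact, binding):
    """Extend binding so that atom matches fact, or return None."""
    extended = dict(binding)
    for term, value in zip(atom, fact):
        if term.startswith("?"):  # terms starting with '?' are variables
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended

def eval_cq(atoms):
    """Evaluate a conjunctive query against the explicit facts only."""
    bindings = [{}]
    for atom in atoms:
        bindings = [m for b in bindings for f in FACTS
                    if (m := match(atom, f, b)) is not None]
    return bindings

def join(left, right):
    """Natural join of two lists of variable bindings."""
    return [{**l, **r} for l in left for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())]

# q(x) :- x type Person, x takes y
QUERY = [("?x", "type", "Person"), ("?x", "takes", "?y")]

# No reformulation: the constraints are ignored, so answers are missed.
direct = {b["?x"] for b in eval_cq(QUERY)}

# UCQ: one conjunctive query per combination of per-atom alternatives.
ucq = [list(cq) for cq in product(*(reformulate_atom(a) for a in QUERY))]
ucq_answers = {b["?x"] for cq in ucq for b in eval_cq(cq)}

# JUCQ: evaluate one union per atom, then join the unions.
fragments = [[b for alt in reformulate_atom(a) for b in eval_cq([alt])]
             for a in QUERY]
jucq_bindings = fragments[0]
for fragment in fragments[1:]:
    jucq_bindings = join(jucq_bindings, fragment)
jucq_answers = {b["?x"] for b in jucq_bindings}

print(sorted(direct), sorted(ucq_answers), sorted(jucq_answers))
```

On this toy data, direct evaluation returns only bob, while both reformulations also return alice, whose membership in Person follows from the subclass constraints. The UCQ and the JUCQ are equivalent here but have different shapes, which is what makes choosing among such alternatives by estimated cost meaningful.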
