Optimised Maintenance of Datalog Materialisations

To efficiently answer queries, datalog systems often materialise all consequences of a datalog program, so the materialisation must be updated whenever the input facts change. Several solutions to the materialisation update problem have been proposed. The Delete/Rederive (DRed) and the Backward/Forward (B/F) algorithms solve this problem for general datalog, but both contain steps that evaluate rules 'backwards' by matching their heads to a fact and evaluating the partially instantiated rule bodies as queries. We show that this can be a considerable source of overhead even on very small updates. In contrast, the Counting algorithm does not evaluate the rules 'backwards', but it can handle only nonrecursive rules. We present two hybrid approaches that combine DRed and B/F with Counting so as to reduce or even eliminate 'backward' rule evaluation while still handling arbitrary datalog programs. We show empirically that our hybrid algorithms are usually significantly faster than existing approaches, sometimes by orders of magnitude.
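
As a rough illustration of the idea behind Counting, the following minimal sketch keeps, for each derived fact, the number of rule instances that derive it, so a deletion only decrements counters and never matches a rule head 'backwards'. It assumes a single nonrecursive rule, path2(x, z) :- edge(x, y), edge(y, z). The rule, the class name `CountingView`, and its methods are hypothetical choices made for this example and are not taken from the paper.

```python
from collections import Counter


class CountingView:
    """Maintains path2(x, z) :- edge(x, y), edge(y, z) using derivation counts.

    Illustrative only: one hard-coded nonrecursive rule, no indexing.
    """

    def __init__(self):
        self.edges = set()       # explicit (input) facts
        self.counts = Counter()  # derived fact -> number of rule instances deriving it

    def _instances(self, fact):
        """Rule instances (x, y, z) that use `fact` in at least one body atom."""
        a, b = fact
        found = set()
        for (c, d) in self.edges:
            if c == b:
                found.add((a, b, d))  # fact plays the role of edge(x, y)
            if d == a:
                found.add((c, a, b))  # fact plays the role of edge(y, z)
        return found

    def add(self, fact):
        if fact in self.edges:
            return
        self.edges.add(fact)
        # Each new rule instance contributes one derivation of its head.
        for (x, _, z) in self._instances(fact):
            self.counts[(x, z)] += 1

    def remove(self, fact):
        if fact not in self.edges:
            return
        # Enumerate affected instances while the fact is still present.
        for (x, _, z) in self._instances(fact):
            self.counts[(x, z)] -= 1
            if self.counts[(x, z)] == 0:
                del self.counts[(x, z)]  # no derivation left: fact is deleted
        self.edges.discard(fact)

    def facts(self):
        return set(self.counts)


view = CountingView()
for edge in [(1, 2), (2, 3), (1, 4), (4, 3)]:
    view.add(edge)

view.remove((1, 2))  # path2(1, 3) survives via 1 -> 4 -> 3
survives = view.facts()
view.remove((4, 3))  # last derivation gone, so path2(1, 3) is removed
```

With a recursive rule, facts can support each other in a cycle and keep nonzero counts after all external support is gone, which is why plain counting of this kind is restricted to nonrecursive rules.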
