The Vadalog System: Datalog-based Reasoning for Knowledge Graphs

Recent years have seen a resurgence of Datalog-based systems, both in the database community and in industry. In this context, it has been recognized that handling today's complex knowledge-based scenarios, such as reasoning over large knowledge graphs, requires extending Datalog with features such as existential quantification. However, Datalog-based reasoning in the presence of existential quantification is undecidable in general, and many efforts have been made to define decidable fragments. Warded Datalog+/- is a very promising one, as it captures PTIME complexity while allowing ontological reasoning. So far, however, no implementation of Warded Datalog+/- has been available. In this paper we present the Vadalog system, a Datalog-based system for performing complex logic reasoning tasks, such as those required in advanced knowledge graphs. The Vadalog system is Oxford's contribution to the VADA research programme, a joint effort of the universities of Oxford, Manchester and Edinburgh and around 20 industrial partners. The main contribution of this paper is the first implementation of Warded Datalog+/-: a high-performance Datalog+/- system that uses an aggressive termination control strategy. We also provide a comprehensive experimental evaluation.
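To illustrate why existential quantification makes termination a concern, the following minimal sketch naively applies a single existential rule, person(X) -> exists Y. father(X, Y), person(Y), and cuts the computation off after a fixed number of rounds. The rule, the function name `chase`, and the `max_rounds` cutoff are all assumptions made for illustration. This is not the Vadalog system's algorithm or its termination control strategy, only a toy showing that an unbounded naive evaluation would keep inventing new values.

```python
from itertools import count


def chase(people, max_rounds):
    """Naively apply: person(X) -> exists Y. father(X, Y), person(Y).

    Hypothetical toy procedure: each person without a recorded father
    gets a fresh placeholder value (a "labelled null"), which is itself
    a person and therefore triggers the rule again in the next round.
    """
    fresh = count()            # source of fresh placeholder names
    people = set(people)
    father = {}                # maps X to its invented witness Y
    for _ in range(max_rounds):
        new_people = set()
        for x in sorted(people):
            if x not in father:            # rule not yet satisfied for x
                y = f"_n{next(fresh)}"     # invent a fresh placeholder
                father[x] = y
                new_people.add(y)
        if not new_people:
            return people, father, True    # fixpoint reached
        people |= new_people
    return people, father, False           # stopped by the round limit


people, father, reached_fixpoint = chase({"alice"}, max_rounds=5)
print(len(people), len(father), reached_fixpoint)
```

Every round adds one new placeholder person, so the loop ends only because of the round limit. Raising `max_rounds` simply produces a longer chain, which is the behaviour that a decidable fragment together with a termination strategy has to keep under control.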
