Mining Rules Incrementally over Large Knowledge Bases

Multiple web-scale knowledge bases, e.g., Freebase, YAGO, and NELL, have been constructed using semi-supervised or unsupervised information extraction techniques, and many of them, despite their large sizes, are continuously growing. Much research effort has been put into mining inference rules from knowledge bases. To address the task of rule mining over evolving web-scale knowledge bases, we propose a parallel incremental rule mining framework. Our approach efficiently mines rules based on the relational model and applies updates to large knowledge bases; we propose an alternative metric that reduces computational complexity without compromising quality; and we apply multiple optimization techniques that reduce runtime by more than two orders of magnitude. Experiments show that our approach scales efficiently to web-scale knowledge bases and saves over 90% of the running time compared to the state-of-the-art batch rule mining system. We also apply our optimization techniques to the batch rule mining algorithm, reducing its runtime by more than half compared to the state of the art. To the best of our knowledge, our incremental rule mining system is the first to handle updates to web-scale knowledge bases.
