RDFRules: Making RDF rule mining easier and even more efficient

AMIE+ is a state-of-the-art algorithm for learning rules from RDF knowledge graphs (KGs). Based on association rule learning, AMIE+ was a breakthrough in speed on large data compared with the previous generation of ILP-based systems. In this paper we present several algorithmic extensions to AMIE+ that make it faster, together with support for data pre-processing and model post-processing, which covers more of the linked data mining process than the original AMIE+ implementation does. The main contributions concern performance: (1) a top-k approach, which addresses the combinatorial explosion that often results from a hand-set minimum support threshold, (2) a grammar for defining fine-grained patterns that reduce the size of the search space, and (3) a faster projection binding that reduces the number of repeated calculations. Other enhancements include the ability to mine across multiple graphs, support for discretization of continuous values, and selection of the most representative rules using proven rule pruning and clustering algorithms. Benchmarks show reductions in mining time of up to several orders of magnitude compared to AMIE+. An open-source implementation is available under the name RDFRules at https://github.com/propi/rdfrules.
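To illustrate the general idea behind a top-k approach, the following Python sketch keeps only the k best rules seen so far and raises the minimum support threshold as better rules arrive, so that weaker candidates can be discarded without a hand-set threshold. This is a minimal sketch under stated assumptions, not the RDFRules implementation: the function name `top_k_rules`, the `(rule, support)` candidate format, and the sample data are hypothetical and chosen only for illustration.

```python
import heapq
from typing import Iterable, List, Tuple


def top_k_rules(
    candidates: Iterable[Tuple[str, int]], k: int
) -> Tuple[List[Tuple[str, int]], int]:
    """Keep the k candidates with the highest support (hypothetical helper).

    Returns the retained rules sorted by descending support and the final
    dynamic threshold.
    """
    heap: List[Tuple[int, str]] = []  # min-heap ordered by support
    threshold = 1  # assumed initial minimum support

    for rule, support in candidates:
        # Candidates below the current threshold are skipped outright.
        if support < threshold:
            continue
        heapq.heappush(heap, (support, rule))
        if len(heap) > k:
            heapq.heappop(heap)  # drop the weakest retained rule
        if len(heap) == k:
            # Once k rules are held, the weakest one sets the new threshold.
            threshold = heap[0][0]

    ranked = sorted(((r, s) for s, r in heap), key=lambda item: -item[1])
    return ranked, threshold


# Hypothetical candidates: (rule text, support)
sample = [("a", 5), ("b", 3), ("c", 8), ("d", 2), ("e", 9), ("f", 1)]
best, final_threshold = top_k_rules(sample, k=3)
print(best, final_threshold)
```

In a real miner the rising threshold would also be used to prune refinements of weak rules, which is where most of the savings would be expected to come from; the sketch only shows the threshold update itself.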
