Mining semantic association rules from RDF data

The Semantic Web opens up new opportunities for data mining research. Semantic Web data is usually represented in the RDF triple format (subject, predicate, object). Large RDF-style Knowledge Bases contain hundreds of millions of RDF triples that represent knowledge in a machine-understandable format. Association rule mining is one of the most effective techniques for detecting frequent patterns. In the context of Semantic Web data mining, most existing methods rely on user intervention, which is time-consuming and error-prone given the large amount of data. Moreover, rule quality factors (e.g. support and confidence) usually consider knowledge only at the instance level; that is, they disregard the knowledge embedded at the schema level. In this paper, we demonstrate that ignoring knowledge encoded at the schema level negatively impacts the interpretation of discovered rules. We introduce an approach called SWARM (Semantic Web Association Rule Mining) that automatically mines Semantic Association Rules from RDF data. The main achievement of SWARM is to reveal common behavioural patterns associated with knowledge at both the instance level and the schema level. We discuss how to utilize knowledge encoded at the schema level to add more semantics to the rules. We compare the semantics of rules discovered by SWARM with those of one of the latest approaches in this field to show the importance of considering schema-level knowledge. Initial experiments performed on RDF-style Knowledge Bases demonstrate the effectiveness of the proposed approach.
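
To make the instance-level quality factors mentioned above concrete, the following is a minimal sketch of how support and confidence might be computed over RDF-style triples, and how rdf:type information could be attached to a rule afterwards. It is not the SWARM algorithm: the toy triples, the class names, the helper functions, and the choice to treat each subject as one transaction are all assumptions made for illustration.

```python
# Toy instance-level triples (subject, predicate, object); all names are hypothetical.
TRIPLES = [
    ("alice", "occupation", "Researcher"),
    ("alice", "livesIn", "Sydney"),
    ("bob", "occupation", "Researcher"),
    ("bob", "livesIn", "Sydney"),
    ("carol", "occupation", "Researcher"),
    ("carol", "livesIn", "Perth"),
    ("dave", "occupation", "Painter"),
    ("dave", "livesIn", "Sydney"),
]

# Toy schema-level knowledge: the class (rdf:type) of each subject (hypothetical).
TYPES = {"alice": "Scientist", "bob": "Scientist",
         "carol": "Scientist", "dave": "Artist"}


def subjects_with(triples, predicate, obj):
    """Return the set of subjects that have the given (predicate, object) pair."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def support_and_confidence(triples, antecedent, consequent):
    """Instance-level support and confidence of the rule antecedent => consequent.

    Assumption: each subject is treated as one transaction, so support is the
    fraction of all subjects that satisfy both sides of the rule.
    """
    body = subjects_with(triples, *antecedent)
    both = body & subjects_with(triples, *consequent)
    all_subjects = {s for s, _, _ in triples}
    support = len(both) / len(all_subjects)
    confidence = len(both) / len(body) if body else 0.0
    return support, confidence, both


def classes_covered(subjects, types):
    """Schema-level annotation: the classes of the subjects that support a rule."""
    return {types[s] for s in subjects if s in types}


antecedent = ("occupation", "Researcher")
consequent = ("livesIn", "Sydney")
support, confidence, supporters = support_and_confidence(TRIPLES, antecedent, consequent)
rule_classes = classes_covered(supporters, TYPES)
print(support, confidence, rule_classes)
```

In this toy data the two measures alone say nothing about which kind of entity the rule describes; the class annotation (here, that every supporting subject is typed as a Scientist) is the sort of schema-level information the abstract argues should inform how a rule is interpreted.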
