Mandolin: A Knowledge Discovery Framework for the Web of Data

Markov Logic Networks combine probabilistic modeling with first-order logic and have been shown to integrate well with Semantic Web foundations. While several approaches have been devised to tackle the subproblems of rule mining, grounding, and inference, no comprehensive workflow has been proposed so far. In this paper, we fill this gap by introducing Mandolin, a framework that implements a workflow for knowledge discovery on RDF datasets. Our framework imports knowledge from referenced graphs, creates similarity relationships among similar literals, and relies on state-of-the-art techniques for rule mining, grounding, and inference computation. We show that our best configuration scales well and achieves results at least comparable to those of other statistical relational learning algorithms on link prediction.
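The abstract does not say how similarity relationships among literals are computed. As a rough illustration only, the sketch below links string literals whose character-trigram Jaccard similarity reaches a threshold. The measure, the threshold of 0.5, and the function names are assumptions made for this example and are not taken from Mandolin.

```python
from itertools import combinations


def trigrams(text):
    """Return the set of character trigrams of a lowercased, padded string."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def jaccard(a, b):
    """Jaccard similarity of the trigram sets of two literals."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def similarity_pairs(literals, threshold=0.5):
    """Hypothetical helper: list (literal, literal, score) pairs at or above the threshold."""
    pairs = []
    # Compare each unordered pair of distinct literals once.
    for a, b in combinations(sorted(set(literals)), 2):
        score = jaccard(a, b)
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs
```

Each returned pair could then be materialized as an extra similarity statement between the two literals, so that later rule mining can treat near-identical values as related. The all-pairs comparison is quadratic in the number of literals, so a sketch like this only suits small inputs.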
