Differentiable Learning of Logical Rules for Knowledge Base Reasoning

We study the problem of learning probabilistic first-order logical rules for knowledge base reasoning. This learning problem is difficult because it requires learning both the parameters, which lie in a continuous space, and the rule structure, which lies in a discrete space. We propose a framework, Neural Logic Programming, that combines the parameter and structure learning of first-order logical rules in an end-to-end differentiable model. This approach is inspired by a recently developed differentiable logic called TensorLog, in which inference tasks can be compiled into sequences of differentiable operations. We design a neural controller system that learns to compose these operations. Empirically, our method outperforms prior work on multiple knowledge base benchmark datasets, including Freebase and WikiMovies.
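To make the phrase "sequences of differentiable operations" concrete, the sketch below shows one common way such a compilation can look: each relation becomes an adjacency matrix, each entity a one-hot vector, and a chain rule becomes a product of matrices. It is an illustration under stated assumptions, not the paper's implementation. The toy entities, the relations `parent_of` and `friend_of`, the row-vector convention, and the helper `soft_chain` are all hypothetical, and the fixed attention logits stand in for what a learned controller would produce.

```python
import numpy as np

# Hypothetical toy knowledge base (not from the paper).
entities = ["alice", "bob", "carol"]
idx = {name: i for i, name in enumerate(entities)}
n = len(entities)

def relation_matrix(pairs):
    """Return M with M[i, j] = 1 when relation(entity_i, entity_j) holds."""
    M = np.zeros((n, n))
    for head, tail in pairs:
        M[idx[head], idx[tail]] = 1.0
    return M

def one_hot(name):
    """Encode an entity as a one-hot row vector."""
    v = np.zeros(n)
    v[idx[name]] = 1.0
    return v

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

parent_of = relation_matrix([("alice", "bob"), ("bob", "carol")])
friend_of = relation_matrix([("alice", "carol")])
operators = np.stack([parent_of, friend_of])  # shape (2, n, n)

def soft_chain(query, step_logits):
    """Apply one attention-weighted mixture of relation matrices per step.

    step_logits holds one logit vector per step; here they are fixed by
    hand, whereas a trained controller would emit them.
    """
    state = one_hot(query)
    for logits in step_logits:
        attention = softmax(np.asarray(logits, dtype=float))
        mixed = np.tensordot(attention, operators, axes=1)  # shape (n, n)
        state = state @ mixed
    return state

# Hard rule: grandparent_of(X, Z) <- parent_of(X, Y), parent_of(Y, Z)
hard = one_hot("alice") @ parent_of @ parent_of

# Soft version: both steps put most of their attention on parent_of.
scores = soft_chain("alice", [[4.0, 0.0], [4.0, 0.0]])
print(entities[int(np.argmax(scores))])
```

Because every step is a matrix product weighted by a softmax, the final scores are differentiable with respect to the logits, which is what allows the discrete choice of which relation to apply at each step to be relaxed and trained by gradient descent.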
