NLProlog: Reasoning with Weak Unification for Question Answering in Natural Language

Rule-based models are attractive for various tasks because they inherently lead to interpretable and explainable decisions and can easily incorporate prior knowledge. However, such systems are difficult to apply to problems involving natural language, due to its large linguistic variability. In contrast, neural models can cope very well with ambiguity by learning distributed representations of words and their composition from data, but lead to models that are difficult to interpret. In this paper, we describe a model combining neural networks with logic programming in a novel manner for solving multi-hop reasoning tasks over natural language. Specifically, we propose to use a Prolog prover which we extend to utilize a similarity function over pretrained sentence encoders. We fine-tune the representations for the similarity function via backpropagation. This leads to a system that can apply rule-based reasoning to natural language and induce domain-specific natural language rules from training data. We evaluate the proposed system on two different question answering tasks, showing that it outperforms two baselines, BiDAF (Seo et al., 2016a) and FastQA (Weissenborn et al., 2017), on a subset of the WikiHop corpus and achieves competitive results on the MedHop data set (Welbl et al., 2017).
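To make the idea of weak unification concrete, the following is a minimal sketch and not the system described in this paper. It assumes hypothetical two-dimensional embeddings in place of a pretrained sentence encoder, softly matches only predicate symbols (entities must match exactly), and uses a fixed threshold; the names `EMBEDDINGS`, `similarity`, and `weak_unify` are illustrative and not taken from the paper.

```python
import math

# Hypothetical 2-d embeddings standing in for a pretrained sentence encoder.
EMBEDDINGS = {
    "is_located_in": (1.0, 0.0),
    "is_in": (0.8, 0.6),
    "was_born_in": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def similarity(s, t):
    """Similarity in [0, 1] between two symbols; identical symbols score 1."""
    if s == t:
        return 1.0
    if s in EMBEDDINGS and t in EMBEDDINGS:
        return max(0.0, cosine(EMBEDDINGS[s], EMBEDDINGS[t]))
    return 0.0


def is_variable(term):
    """Prolog convention: terms starting with an uppercase letter are variables."""
    return term[:1].isupper()


def weak_unify(goal, fact, threshold=0.5):
    """Return (score, bindings) if goal weakly unifies with fact, else None.

    goal and fact are tuples of the form (predicate, arg1, arg2, ...).
    Assumption: only predicates are compared by similarity in this sketch.
    """
    if len(goal) != len(fact):
        return None
    score = similarity(goal[0], fact[0])
    if score < threshold:
        return None
    bindings = {}
    for g, f in zip(goal[1:], fact[1:]):
        if is_variable(g):
            # A variable must be bound consistently across the atom.
            if bindings.setdefault(g, f) != f:
                return None
        elif g != f:
            return None
    return score, bindings


# The query predicate differs from the fact predicate but is similar enough.
query = ("is_located_in", "park", "X")
print(weak_unify(query, ("is_in", "park", "queens")))
print(weak_unify(query, ("was_born_in", "alice", "queens")))
```

In a full prover, the unification scores collected along a proof would be aggregated into a single proof score, for example with a t-norm such as the minimum or the product; which aggregation is used here is left open, and the similarity function would be trained rather than fixed.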

[1]  Franco Turini,et al.  A Survey of Methods for Explaining Black Box Models , 2018, ACM Comput. Surv..

[2]  Cathy H. Wu,et al.  UniProt: the Universal Protein knowledgebase , 2004, Nucleic Acids Res..

[3]  Peter Norvig,et al.  Artificial Intelligence: A Modern Approach , 1995 .

[4]  Richard Socher,et al.  Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering , 2019, ICLR.

[5]  André Freitas,et al.  A Survey on Open Information Extraction , 2018, COLING.

[6]  Been Kim,et al.  Towards A Rigorous Science of Interpretable Machine Learning , 2017, 1702.08608.

[7]  Clemente Rubio-Manzano,et al.  Bousi~Prolog: a Prolog Extension Language for Flexible Query Answering , 2009, PROLE.

[8]  Richard Socher,et al.  Ask Me Anything: Dynamic Memory Networks for Natural Language Processing , 2015, ICML.

[9]  Katrin Erk,et al.  Probabilistic Soft Logic for Semantic Textual Similarity , 2014, ACL.

[10]  Matthew Richardson,et al.  Markov logic networks , 2006, Machine Learning.

[11]  Maria I. Sessa,et al.  Approximate reasoning by similarity-based SLD resolution , 2002, Theor. Comput. Sci..

[12]  Katrin Erk,et al.  Integrating Logical Representations with Probabilistic Information using Markov Logic , 2011, IWCS.

[13]  Jason Weston,et al.  Tracking the World State with Recurrent Entity Networks , 2016, ICLR.

[14]  Andrew McCallum,et al.  Compositional Vector Space Models for Knowledge Base Completion , 2015, ACL.

[15]  Sebastian Riedel,et al.  Constructing Datasets for Multi-hop Reading Comprehension Across Documents , 2017, TACL.

[16]  Isabelle Augenstein,et al.  Jack the Reader – A Machine Reading Framework , 2018, ACL.

[17]  Tapio Salakoski,et al.  Distributional Semantics Resources for Biomedical Text Processing , 2013 .

[18]  Ali Farhadi,et al.  Bidirectional Attention Flow for Machine Comprehension , 2016, ICLR.

[19]  Chris Dyer,et al.  Dynamic Integration of Background Knowledge in Neural NLU Systems , 2017, 1706.02596.

[20]  Sameer Singh,et al.  Injecting Logical Background Knowledge into Embeddings for Relation Extraction , 2015, NAACL.

[21]  Zachary Chase Lipton.  The mythos of model interpretability , 2016, ACM Queue.

[22]  Stephen H. Bach,et al.  Hinge-Loss Markov Random Fields and Probabilistic Soft Logic , 2015, J. Mach. Learn. Res..

[23]  Rajarshi Das,et al.  Weaver: Deep Co-Encoding of Questions and Documents for Machine Reading , 2018, ArXiv.

[24]  Cuong Chau,et al.  Montague Meets Markov: Deep Semantics with Probabilistic Logical Form , 2013, *SEMEVAL.

[25]  Ruslan Salakhutdinov,et al.  Neural Models for Reasoning over Multiple Mentions Using Coreference , 2018, NAACL.

[26]  Jennifer Chu-Carroll,et al.  Building Watson: An Overview of the DeepQA Project , 2010, AI Mag..

[27]  Luc De Raedt,et al.  Inductive Logic Programming: Theory and Methods , 1994, J. Log. Program..

[28]  Katrin Erk,et al.  A Formal Approach to Linking Logical Form and Vector-Space Lexical Semantics , 2014 .

[29]  Tim Rocktäschel,et al.  End-to-end Differentiable Proving , 2017, NIPS.

[30]  Rajarshi Das,et al.  Chains of Reasoning over Entities, Relations, and Text using Recurrent Neural Networks , 2016, EACL.

[31]  M. Gupta,et al.  Theory of T-norms and fuzzy inference methods , 1991 .

[32]  Oren Etzioni,et al.  Open question answering over curated and extracted knowledge bases , 2014, KDD.

[33]  Christopher D. Manning,et al.  Combining Natural Logic and Shallow Reasoning for Question Answering , 2016, ACL.

[34]  Ali Farhadi,et al.  Query-Reduction Networks for Question Answering , 2016, ICLR.

[35]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[36]  Dirk Weissenborn,et al.  Making Neural QA as Simple as Possible but not Simpler , 2017, CoNLL.

[37]  Kai-Uwe Kühnberger,et al.  Neural-Symbolic Learning and Reasoning: A Survey and Interpretation , 2017, Neuro-Symbolic Artificial Intelligence.

[38]  Yann LeCun,et al.  What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[39]  Jason Weston,et al.  End-To-End Memory Networks , 2015, NIPS.

[40]  Sanda M. Harabagiu,et al.  COGEX: A Logic Prover for Question Answering , 2003, NAACL.

[41]  Nicola De Cao,et al.  Question Answering by Reasoning Across Documents with Graph Convolutional Networks , 2018, NAACL.

[42]  Stephen Muggleton,et al.  Inductive Logic Programming , 2011, Lecture Notes in Computer Science.

[43]  Alexander J. Smola,et al.  Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning , 2017, ICLR.

[44]  Matteo Pagliardini,et al.  Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features , 2017, NAACL.

[45]  David S. Wishart,et al.  DrugBank: a comprehensive resource for in silico drug discovery and exploration , 2005, Nucleic Acids Res..

[46]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[47]  Kam-Fai Wong,et al.  Towards Neural Network-based Reasoning , 2015, ArXiv.

[48]  Dirk Weissenborn,et al.  FastQA: A Simple and Efficient Neural Architecture for Question Answering , 2017, ArXiv.

[49]  Wanli Ma,et al.  An Overview of Temporal and Modal Logic Programming , 1994, ICTL.

[50]  Richard Evans,et al.  Learning Explanatory Rules from Noisy Data , 2017, J. Artif. Intell. Res..

[51]  William W. Cohen.  TensorLog: A Differentiable Deductive Database , 2016, ArXiv.

[52]  Sergio Gomez Colmenarejo,et al.  Hybrid computing using a neural network with dynamic external memory , 2016, Nature.