Transformers as Soft Reasoners over Language

Beginning with McCarthy's Advice Taker (1959), AI has pursued the goal of providing a system with explicit, general knowledge and having the system reason over that knowledge. However, expressing the knowledge in a formal (logical or probabilistic) representation has been a major obstacle to this research. This paper investigates a modern approach to this problem in which the facts and rules are provided as natural language sentences, thus bypassing a formal representation. We train transformers to reason (or emulate reasoning) over these sentences using synthetically generated data. Our models, which we call RuleTakers, provide the first empirical demonstration that this kind of soft reasoning over language is learnable, can achieve high (99%) accuracy, and generalizes to test data requiring substantially deeper chaining than seen during training (95%+ scores). We also demonstrate that the models transfer well to two hand-authored rulebases, and to rulebases paraphrased into more natural language. These findings are significant, as they suggest a new role for transformers, namely as limited "soft theorem provers" operating over explicit theories in language. This in turn suggests new possibilities for explainability, correctability, and counterfactual reasoning in question-answering.
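To make the setup concrete, below is a minimal sketch (not the authors' released code) of how a RuleTaker-style query could be posed to a transformer fine-tuned for true/false classification. It assumes the HuggingFace transformers library; the checkpoint name is a hypothetical placeholder, and the True/False label convention is an assumption. The paper itself fine-tunes RoBERTa on synthetically generated (theory, question) pairs.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical checkpoint: a RoBERTa model fine-tuned to label
# (theory, question) pairs as True/False. Placeholder name, not a real model id.
MODEL_NAME = "your-org/ruletaker-roberta"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

# A tiny natural-language "theory": facts and rules as plain sentences,
# with no intermediate formal (logical) representation.
theory = (
    "Harry is young. Harry is kind. "
    "If someone is young and kind then they are nice. "
    "All nice people are popular."
)
question = "Harry is popular."  # answering requires chaining two rules

# Encode theory and question as a sentence pair, NLI-style.
inputs = tokenizer(theory, question, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Assumed convention: label 1 = True, label 0 = False.
prediction = "True" if logits.argmax(dim=-1).item() == 1 else "False"
print(f"Model answers: {prediction}")
```

The key design point the sketch illustrates is that the "theorem proving" is implicit: the model is simply a sequence-pair classifier, and any multi-step chaining over the rules happens inside the transformer rather than in an explicit inference engine.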
