Social Commonsense Reasoning with Multi-Head Knowledge Attention

Social Commonsense Reasoning requires understanding of text, knowledge about social events and their pragmatic implications, and commonsense reasoning skills. In this work we propose a novel multi-head knowledge attention model that encodes semi-structured commonsense inference rules and learns to incorporate them into a transformer-based reasoning cell. We assess the model's performance on two tasks that require different reasoning skills: Abductive Natural Language Inference and, as a new task, Counterfactual Invariance Prediction. We show that our proposed model improves performance over strong state-of-the-art models (i.e., RoBERTa) on both reasoning tasks. Notably, we are, to the best of our knowledge, the first to demonstrate that a model that learns to perform counterfactual reasoning helps predict the best explanation in an abductive reasoning task. We validate the robustness of the model's reasoning capabilities by perturbing the knowledge and provide a qualitative analysis of the model's knowledge incorporation capabilities.
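Since the abstract only names the components, the sketch below is a minimal illustration (not the authors' released implementation) of how a multi-head knowledge attention cell could fuse encoded inference rules with a transformer context representation. The class name, dimensions, residual/layer-norm wiring, and the use of PyTorch's built-in multi-head attention are all assumptions for exposition.

```python
# Illustrative sketch only: a reasoning cell in which the encoded task context
# (e.g., RoBERTa token representations) attends over encoded commonsense
# inference rules via multi-head attention. Names and wiring are assumptions.
import torch
import torch.nn as nn

class KnowledgeAttentionCell(nn.Module):
    """Hypothetical cell: context queries attend to knowledge-rule encodings."""

    def __init__(self, hidden_dim: int = 768, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, context: torch.Tensor, knowledge: torch.Tensor) -> torch.Tensor:
        # context:   (batch, ctx_len, hidden_dim)  encoded task instance
        # knowledge: (batch, n_rules, hidden_dim)  encoded semi-structured inference rules
        fused, _ = self.attn(query=context, key=knowledge, value=knowledge)
        # Residual connection plus layer norm, as in a standard transformer block.
        return self.norm(context + fused)


# Toy usage with random tensors standing in for real encoder outputs.
cell = KnowledgeAttentionCell()
ctx = torch.randn(2, 16, 768)    # two examples, 16 context tokens each
rules = torch.randn(2, 5, 768)   # five encoded inference rules per example
out = cell(ctx, rules)           # shape: (2, 16, 768)
```

The design choice illustrated here is simply that the knowledge is consumed as keys and values while the task context supplies the queries, so each context token can softly select which inference rules to incorporate; any gating, rule scoring, or classification head described in the paper itself is omitted.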
