Multi-Hop Fact Checking of Political Claims

Recently, novel multi-hop models and datasets have been introduced to achieve more complex natural language reasoning with neural networks. One notable task that requires multi-hop reasoning is fact checking, where a chain of connected evidence pieces leads to the final verdict for a claim. However, existing datasets do not provide annotations of the gold evidence pieces, a critical aspect for improving the explainability of fact checking systems. The only exception is the FEVER dataset, which is artificially constructed from Wikipedia and therefore does not contain naturally occurring political claims and evidence pages, which are more challenging to verify. Moreover, most claims in FEVER have only one associated evidence sentence and require no reasoning to predict a label; the small number of instances with two evidence sentences requires only simple reasoning. In this paper, we study how to perform more complex claim verification on naturally occurring claims with multiple hops over evidence chunks. We first construct a small annotated dataset, PolitiHop, of reasoning chains for claim verification. We then compare the dataset to other existing multi-hop datasets and study how to transfer knowledge from larger in- and out-of-domain resources to PolitiHop. We find that the task is complex and achieve the best performance with an architecture that specifically models reasoning over evidence chains, combined with in-domain transfer learning.
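
The best-performing setup described above combines an architecture that reasons over chains of evidence with in-domain transfer learning. The sketch below is a minimal, hypothetical illustration of one way such evidence-chain reasoning can be modelled, not the paper's exact architecture: each evidence sentence is encoded together with the claim by a pretrained transformer, the per-hop vectors are passed through a small transformer encoder so that hops can attend to one another, and a pooled chain representation is classified into veracity labels. The encoder name, layer sizes, pooling choice, and three-way label set are illustrative assumptions.

# Hypothetical sketch of claim verification over an evidence chain.
# Assumptions (not taken from the paper): bert-base-uncased encoder,
# a 2-layer hop encoder, mean pooling over hops, 3 veracity labels.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class EvidenceChainVerifier(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", num_labels=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Small transformer over the sequence of evidence-hop vectors,
        # letting each hop attend to the others (chain-level reasoning).
        hop_layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=8, batch_first=True
        )
        self.hop_encoder = nn.TransformerEncoder(hop_layer, num_layers=2)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, claim, evidence_sentences, tokenizer):
        # Encode each (claim, evidence) pair; keep the [CLS] vector per hop.
        batch = tokenizer(
            [claim] * len(evidence_sentences),
            evidence_sentences,
            padding=True,
            truncation=True,
            return_tensors="pt",
        )
        cls = self.encoder(**batch).last_hidden_state[:, 0, :]  # (hops, hidden)
        chain = self.hop_encoder(cls.unsqueeze(0))              # (1, hops, hidden)
        pooled = chain.mean(dim=1)                              # aggregate the chain
        return self.classifier(pooled)                          # (1, num_labels)


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = EvidenceChainVerifier()
with torch.no_grad():
    logits = model(
        "Example claim text.",
        ["First evidence sentence.", "Second evidence sentence in the chain."],
        tokenizer,
    )

Fine-tuning such a model first on a larger fact-checking resource and then on PolitiHop would correspond to the in-domain transfer learning setting mentioned above.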
