MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims

We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim verification. It is collected from 26 English-language fact-checking websites, paired with textual sources and rich metadata, and labelled for veracity by human expert journalists. We present an in-depth analysis of the dataset, highlighting its characteristics and challenges. Further, we present results for automatic veracity prediction, both with established baselines and with a novel method for jointly ranking evidence pages and predicting veracity, which outperforms all baselines. Encoding evidence and modelling metadata both yield significant performance increases. Our best-performing model achieves a Macro F1 of 49.2%, showing that this is a challenging testbed for claim veracity prediction.
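The headline result is reported as Macro F1, which averages per-label F1 scores with equal weight, so rare veracity labels count as much as frequent ones. The following is a minimal sketch of that metric in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the dataset or the authors' evaluation code.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was wrong
            fn[g] += 1  # missed the gold label g

    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy example: an imbalanced label set where a majority-class
# predictor reaches 80% accuracy but a much lower Macro F1.
gold = ["false"] * 8 + ["true"] * 2
pred = ["false"] * 10
print(round(macro_f1(gold, pred), 3))  # 0.444
```

In this toy case the majority label scores an F1 of about 0.889 and the minority label scores 0, so the macro average is about 0.444. This illustrates why a Macro F1 near 50% on a dataset with many, unevenly distributed veracity labels can still reflect a hard task.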
