WhatTheWikiFact: Fact-Checking Claims Against Wikipedia

The rise of the Internet has made it a major source of information. Unfortunately, not all information online is true, and a number of fact-checking initiatives, both manual and automatic, have been launched to address the problem. Here, we present our contribution: WhatTheWikiFact, a system for automatic claim verification using Wikipedia. The system predicts the veracity of an input claim and shows the evidence it retrieved during verification. It displays confidence scores and a list of relevant Wikipedia articles, together with detailed information about each article: the phrase used to retrieve it, the most relevant sentences extracted from it, their stance with respect to the input claim, and the associated probabilities. The system supports several languages: Bulgarian, English, and Russian.
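The output described above can be pictured as a small nested record. The following is a minimal sketch only, assuming a Python representation; every class name, field name, stance label, and verdict label is hypothetical and chosen for illustration, and the sample values are placeholders rather than real system output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvidenceSentence:
    """One sentence extracted from a retrieved article."""
    text: str
    stance: str         # stance toward the claim; label names are assumed
    probability: float  # probability associated with the stance


@dataclass
class RetrievedArticle:
    """One Wikipedia article returned for the claim."""
    title: str
    retrieval_phrase: str  # the phrase used to retrieve the article
    sentences: List[EvidenceSentence] = field(default_factory=list)


@dataclass
class VerificationResult:
    """Predicted veracity of a claim plus the evidence shown with it."""
    claim: str
    language: str      # e.g. "en", "bg", or "ru"
    verdict: str       # veracity label; label names are assumed
    confidence: float
    articles: List[RetrievedArticle] = field(default_factory=list)

    def evidence_by_stance(self, stance: str) -> List[EvidenceSentence]:
        """Collect evidence sentences with the given stance across all articles."""
        return [
            sentence
            for article in self.articles
            for sentence in article.sentences
            if sentence.stance == stance
        ]


# Placeholder data for illustration only; not produced by the real system.
result = VerificationResult(
    claim="Sofia is the capital of Bulgaria.",
    language="en",
    verdict="supported",
    confidence=0.9,
    articles=[
        RetrievedArticle(
            title="Sofia",
            retrieval_phrase="Sofia",
            sentences=[
                EvidenceSentence("Sofia is the capital of Bulgaria.", "supports", 0.95),
                EvidenceSentence("The city has a long history.", "neutral", 0.80),
            ],
        )
    ],
)

supporting = result.evidence_by_stance("supports")
print(len(supporting), supporting[0].probability)
```

Keeping the retrieval phrase on the article and the stance on each sentence mirrors the two levels of detail the abstract lists, so a reader can trace a verdict back to the specific sentences behind it.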
