The CLEF-2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News

We describe the fourth edition of the CheckThat! Lab, part of the 2021 Cross-Language Evaluation Forum (CLEF). The lab evaluates technology supporting various tasks related to factuality, and it is offered in Arabic, Bulgarian, English, and Spanish. Task 1 asks participants to predict which tweets in a Twitter stream are worth fact-checking, with a focus on COVID-19. Task 2 asks them to determine whether a claim in a tweet can be verified using a set of previously fact-checked claims. Task 3 asks them to predict the veracity of a target news article and its topical domain. The ranking tasks are evaluated using mean average precision or precision at rank k, and the classification tasks using F1. © 2021, Springer Nature Switzerland AG.
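The evaluation measures named above can be illustrated with a minimal Python sketch. It is not the lab's official scorer. The function names are hypothetical, relevance labels are assumed to be binary (1 = relevant, 0 = not), average precision is assumed to be normalized by the number of relevant items present in the ranking, and F1 is shown for a single positive class only.

```python
from typing import List, Sequence


def precision_at_k(ranked_labels: Sequence[int], k: int) -> float:
    """Fraction of the top-k ranked items that are relevant (label 1)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_labels[:k]) / k


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Mean of precision values taken at the rank of each relevant item.

    Assumption: every relevant item appears somewhere in the ranking,
    so the number of hits is used as the normalizer.
    """
    hits = 0
    total = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings: List[Sequence[int]]) -> float:
    """Average of per-query average precision over all rankings."""
    if not rankings:
        raise ValueError("need at least one ranking")
    return sum(average_precision(r) for r in rankings) / len(rankings)


def f1_score(gold: Sequence[int], pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for one positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ranking = [1, 0, 1, 0]  # relevance labels in ranked order (toy data)
    print(precision_at_k(ranking, 2))
    print(average_precision(ranking))
    print(f1_score([1, 0, 1, 1], [1, 1, 0, 1]))
```

For a ranking task, each input list holds the gold labels of the items in the order a system ranked them. For a classification task, `gold` and `pred` are aligned label sequences.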
