The IPIPAN Team Participation in the Check-Worthiness Task of the CLEF2019 CheckThat! Lab

This paper describes the participation of the IPIPAN team in the CLEF-2019 CheckThat! Lab, which focuses on the automatic identification and verification of claims. We participated in Task 1, which assesses the check-worthiness of claims in political debates by identifying and ranking the sentences that should be prioritized for fact-checking. We propose a logistic regression classifier that uses features such as vector representations of sentences, part-of-speech (POS) tags, named entities, and sentiment scores. In the official evaluation, our best-performing run ranked 3rd out of 12 teams.
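The general idea of scoring sentences with logistic regression and ranking them by predicted probability can be sketched as follows. This is a minimal illustration under stated assumptions, not the system submitted to the lab: the hand-crafted features (a digit flag, a capitalization ratio as a crude stand-in for named entities, a tiny hypothetical sentiment lexicon, and sentence length) replace the sentence vectors, POS tags, named entities, and sentiment scores named in the abstract, and the training sentences, labels, and hyperparameters are invented for the example.

```python
import math
import re

# Hypothetical toy lexicon standing in for a real sentiment scorer.
SENTIMENT_WORDS = {"great", "terrible", "worst", "best", "disaster"}


def featurize(sentence):
    """Map a sentence to a small feature vector (bias term first)."""
    tokens = re.findall(r"[A-Za-z']+|\d+(?:\.\d+)?", sentence)
    n = max(len(tokens), 1)
    has_number = 1.0 if any(t[0].isdigit() for t in tokens) else 0.0
    # Capitalized non-initial tokens: a crude proxy for named entities.
    caps = sum(1 for t in tokens[1:] if t[0].isupper()) / n
    sentiment = sum(1 for t in tokens if t.lower() in SENTIMENT_WORDS) / n
    length = min(n / 20.0, 1.0)
    return [1.0, has_number, caps, sentiment, length]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, lr=0.5, epochs=500, l2=0.01):
    """Fit L2-regularized logistic regression by batch gradient descent."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * (g / len(X) + l2 * wj) for wj, g in zip(w, grad)]
    return w


def score(w, sentence):
    """Predicted probability that the sentence is check-worthy."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, featurize(sentence))))


def rank(w, sentences):
    """Order sentences from most to least check-worthy."""
    return sorted(sentences, key=lambda s: score(w, s), reverse=True)


# Invented toy data: 1 = check-worthy, 0 = not check-worthy.
TRAIN = [
    ("Unemployment fell to 4 percent under President Obama.", 1),
    ("We cut taxes by 20 billion dollars in Ohio.", 1),
    ("Crime in Chicago rose 15 percent last year.", 1),
    ("Thank you very much.", 0),
    ("Let me respond to that.", 0),
    ("Good evening everyone.", 0),
]

weights = train([featurize(s) for s, _ in TRAIN], [label for _, label in TRAIN])
ranked = rank(weights, ["Thank you all.", "Exports grew 8 percent in Texas."])
for sentence in ranked:
    print(f"{score(weights, sentence):.3f}  {sentence}")
```

Ranking by predicted probability rather than by a hard 0/1 decision matches the task framing in the abstract, where sentences are prioritized for fact-checking instead of merely classified.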
