Entity Detection for Check-worthiness Prediction: Glasgow Terrier at CLEF CheckThat! 2019

Since anyone can create and share information online, manually fact-checking everything that users encounter every day requires considerable time and effort. Hence, an automatic fact-checking process is needed to handle the vast amount of information available online. However, gathering evidence for every single claim is wasteful, as not all sentences or articles are check-worthy. In this paper, we propose an effective approach for retrieving check-worthy sentences within American political debates, which corresponds to the first task of the CLEF CheckThat! 2019 Lab. To rank sentences by their check-worthiness, we represent each sentence by the entities it mentions, encoded as a TF-IDF vector, and use an SVM classifier to predict the check-worthiness of each sentence. Our approach ranked 4th out of 12 submissions. Our experiments show that the pronoun and coreference resolution pre-processing procedure we use as part of our approach improves the effectiveness of sentence check-worthiness prediction. Furthermore, our results show that entity analysis features provide valuable evidence for this task.
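
The pipeline summarised above (entity mentions, TF-IDF weighting, SVM scoring, ranking) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that entity linking and coreference resolution have already been applied upstream, so each sentence arrives as a list of entity identifiers. The entity names, the toy training data, the smoothed IDF formula, and the small batch subgradient trainer standing in for an off-the-shelf SVM are all assumptions made for illustration.

```python
import math
from collections import Counter


def fit_idf(entity_lists):
    """Smoothed inverse document frequency for each entity seen in training."""
    n_docs = len(entity_lists)
    doc_freq = Counter(e for ents in entity_lists for e in set(ents))
    return {e: math.log((1 + n_docs) / (1 + df)) + 1.0 for e, df in doc_freq.items()}


def to_vector(entities, idf):
    """L2-normalised TF-IDF vector (as a dict); unseen entities are ignored."""
    counts = Counter(e for e in entities if e in idf)
    vec = {e: tf * idf[e] for e, tf in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {e: v / norm for e, v in vec.items()} if norm else {}


def decision(w, b, x):
    """Signed score of a linear classifier; higher means more check-worthy."""
    return sum(w.get(e, 0.0) * v for e, v in x.items()) + b


def train_linear_svm(vectors, labels, epochs=300, lr=0.1, reg=0.01):
    """Batch subgradient descent on the L2-regularised hinge loss (labels are +1/-1)."""
    w, b, n = {}, 0.0, len(vectors)
    for _ in range(epochs):
        grad_w = {e: reg * v for e, v in w.items()}
        grad_b = 0.0
        for x, y in zip(vectors, labels):
            if y * decision(w, b, x) < 1:  # example violates the margin
                for e, v in x.items():
                    grad_w[e] = grad_w.get(e, 0.0) - y * v / n
                grad_b -= y / n
        for e, g in grad_w.items():
            w[e] = w.get(e, 0.0) - lr * g
        b -= lr * grad_b
    return w, b


# Hypothetical training sentences, already reduced to their linked entities.
train_entities = [
    ["Unemployment", "United_States"],
    ["Tax_cut", "Barack_Obama"],
    ["NAFTA", "Unemployment"],
    [],
    ["Moderator"],
    [],
]
train_labels = [1, 1, 1, -1, -1, -1]  # +1 = check-worthy, -1 = not

idf = fit_idf(train_entities)
w, b = train_linear_svm([to_vector(e, idf) for e in train_entities], train_labels)

# Hypothetical debate sentences to rank, with their (assumed) entity annotations.
debate = {
    "NAFTA drove unemployment up in our state.": ["NAFTA", "Unemployment"],
    "Thank you very much.": [],
}
scores = {s: decision(w, b, to_vector(e, idf)) for s, e in debate.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Ranking by the raw decision score rather than by the predicted label matches the retrieval framing of the task: the output is an ordered list of sentences, not a binary decision per sentence.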
