Paper information - DeFactoNLP: Fact Verification using Entity Recognition, TFIDF Vector Comparison and Decomposable Attention

In this paper, we describe DeFactoNLP, the system we designed for the FEVER 2018 Shared Task. The aim of this task was to conceive a system that can not only automatically assess the veracity of a claim but also retrieve evidence supporting this assessment from Wikipedia. In our approach, the Wikipedia documents whose Term Frequency-Inverse Document Frequency (TFIDF) vectors are most similar to the vector of the claim and those documents whose names are similar to those of the named entities (NEs) mentioned in the claim are identified as the documents which might contain evidence. The sentences in these documents are then supplied to a textual entailment recognition module. This module calculates the probability of each sentence supporting the claim, contradicting the claim or not providing any relevant information to assess the veracity of the claim. Various features computed using these probabilities are finally used by a Random Forest classifier to determine the overall truthfulness of the claim. The sentences which support this classification are returned as evidence. Our approach achieved a 0.4277 evidence F1-score, a 0.5136 label accuracy and a 0.3833 FEVER score.
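The retrieval and aggregation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the smoothed IDF formula, the toy corpus and the particular aggregate features are all assumptions made for illustration. The textual entailment module and the Random Forest classifier are not implemented here; the per-sentence probabilities are simply taken as given inputs.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(docs):
    """Smoothed inverse document frequency per term (formula is an assumption)."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the corpus are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {term: freq * idf[term] for term, freq in tf.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_documents(claim, corpus, k=2):
    """Return the k (title, score) pairs most similar to the claim."""
    docs = {title: tokenize(text) for title, text in corpus.items()}
    idf = build_idf(docs)
    claim_vec = tfidf_vector(tokenize(claim), idf)
    scores = {t: cosine(claim_vec, tfidf_vector(toks, idf)) for t, toks in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


def aggregate_features(sentence_probs):
    """Hypothetical claim-level features from per-sentence probabilities.

    sentence_probs is a list of (p_support, p_refute, p_no_info) tuples,
    assumed to come from an entailment model that is not shown here.
    """
    labels = ("support", "refute", "no_info")
    n = len(sentence_probs)
    feats = {}
    for i, name in enumerate(labels):
        column = [p[i] for p in sentence_probs]
        feats["max_" + name] = max(column) if column else 0.0
        feats["mean_" + name] = sum(column) / n if n else 0.0
        # Number of sentences whose most probable label is this one.
        feats["n_" + name] = sum(1 for p in sentence_probs if p.index(max(p)) == i)
    return feats


if __name__ == "__main__":
    toy_corpus = {
        "Paris": "Paris is the capital and most populous city of France.",
        "Berlin": "Berlin is the capital and largest city of Germany.",
    }
    print(rank_documents("Paris is the capital of France.", toy_corpus, k=1))
    print(aggregate_features([(0.8, 0.1, 0.1), (0.2, 0.7, 0.1)]))
```

In a full pipeline of the kind the abstract outlines, a feature dictionary like the one returned by `aggregate_features` would be passed to a classifier such as a Random Forest to predict the claim label. The sketch stops before that step.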

Diego Esteves | Gil Rocha | Aniketh Janardhan Reddy
