A Novel Approach Towards Fake News Detection: Deep Learning Augmented with Textual Entailment Features

The rapid growth of information on the web has spurred research on automatic fact checking and on the detection of fake news and misinformation. This is an emerging and challenging problem area in Natural Language Processing (NLP), Machine Learning (ML), and Data Science. One such problem is estimating the veracity of a news story, which is complex and difficult. The recently released Fake News Challenge Stage 1 (FNC-1) dataset introduced stance detection as a benchmark task. This task could be an effective first step towards building a robust fact-checking system. In this paper, we relate the stance detection problem to Textual Entailment (TE). We present systems based on statistical ML, Deep Learning (DL), and a combination of both. Empirical evaluation shows encouraging performance, with our systems outperforming the state-of-the-art system.
