Learning from Fact-checkers: Analysis and Generation of Fact-checking Language

To fight fake news, many fact-checking systems, comprising human-based fact-checking sites (e.g., snopes.com and politifact.com) and automatic detection systems, have been developed in recent years. However, online users keep sharing fake news even after it has been debunked. This suggests that early fake news detection alone may be insufficient and that a complementary approach is needed to mitigate the spread of misinformation. In this paper, we introduce a novel application of text generation for combating fake news. In particular, we (1) leverage online users called fact-checkers, who cite fact-checking sites as credible evidence to fact-check information in public discourse; (2) analyze the linguistic characteristics of fact-checking tweets; and (3) propose and build a deep learning framework that generates responses with fact-checking intention, to increase fact-checkers' engagement in fact-checking activities. Our analysis reveals that fact-checkers tend to refute misinformation and use formal language (e.g., few swear words and little Internet slang). Our framework successfully generates relevant responses and outperforms competing models, achieving improvements of up to 30%. Our qualitative study also confirms the superiority of our generated responses over responses generated by existing models.
