Combating Fake News: A Survey on Identification and Mitigation Techniques

The proliferation of fake news on social media has opened up new directions of research for its timely identification and containment, and for mitigation of its widespread impact on public opinion. While much of the earlier research focused on identifying fake news from its content or from users' engagements with the news on social media, there has been rising interest in proactive intervention strategies to counter the spread of misinformation and its impact on society. In this survey, we describe the modern-day problem of fake news and, in particular, highlight the technical challenges associated with it. We discuss existing methods and techniques applicable to both identification and mitigation, with a focus on the significant advances in each method and their advantages and limitations. In addition, research has often been limited by the quality of existing datasets and their specific application contexts. To alleviate this problem, we comprehensively compile and summarize the characteristic features of available datasets. Furthermore, we outline new directions of research to facilitate future development of effective and interdisciplinary solutions.
