Combating Fake News

The proliferation of fake news on social media has opened up new directions of research for the timely identification and containment of fake news and the mitigation of its widespread impact on public opinion. While much of the earlier research focused on identifying fake news from its content or from users’ engagement with the news on social media, there has been rising interest in proactive intervention strategies to counter the spread of misinformation and its impact on society. In this survey, we describe the modern-day problem of fake news and, in particular, highlight the technical challenges associated with it. We discuss existing methods and techniques applicable to both identification and mitigation, with a focus on the significant advances in each method and their advantages and limitations. In addition, research has often been limited by the quality of existing datasets and their specific application contexts. To alleviate this problem, we comprehensively compile and summarize characteristic features of available datasets. Furthermore, we outline new directions of research to facilitate future development of effective and interdisciplinary solutions.
