False Information on Web and Social Media: A Survey

False information can be created and spread easily through the web and social media platforms, resulting in widespread real-world impact. Characterizing how false information proliferates on social platforms and why it succeeds in deceiving readers is critical for developing efficient detection algorithms and tools for early detection. A recent surge of research in this area has aimed to address the key issues using methods based on feature engineering, graph mining, and information modeling. The majority of this research has focused on two broad categories of false information: opinion-based (e.g., fake reviews) and fact-based (e.g., false news and hoaxes). Therefore, in this work, we present a comprehensive survey spanning diverse aspects of false information, namely (i) the actors involved in spreading false information, (ii) the rationale behind successfully deceiving readers, (iii) quantifying the impact of false information, (iv) measuring its characteristics across different dimensions, and finally, (v) algorithms developed to detect false information. In doing so, we create a unified framework to describe these recent methods and highlight a number of important directions for future research.
