Analyzing Textual (Mis)Information Shared in WhatsApp Groups

WhatsApp is a messaging app that is currently very popular around the world. With a user-friendly interface, it allows people to exchange messages instantaneously in an intuitive and fluid way. The app also supports group chats, in which people share messages, videos, audio, and images. These groups can also be fertile ground for spreading rumors and misinformation. In this work, we analyzed the messages shared in a number of politically oriented WhatsApp groups, focusing on textual content, as it is the most shared media type. Our study relied on a dataset containing all textual messages shared in those groups during the 2018 Brazilian presidential campaign. We identified the presence of misinformation in the contents of these messages using a dataset of previously checked misinformation from six Brazilian fact-checking sites. Our study aims to identify characteristics that distinguish such messages from the other textual messages (with unchecked content). To that end, we analyzed various properties of the textual content (e.g., language usage, main topics, and sentiment of the message content) and the propagation dynamics of both sets of messages. Our analyses revealed that textual messages with misinformation tend to be concentrated on fewer topics, often carrying words related to the cognitive process of insight, which characterizes chain messages. We also found that their propagation process is much more viral, with a distinct behavior: they tend to propagate faster within particular groups but take longer to cross group boundaries.
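As an illustration of the labeling step described above, the following is a minimal sketch of how messages might be matched against previously fact-checked texts. It is not the method used in this work: the bag-of-words cosine similarity, the function names (`tokenize`, `cosine_similarity`, `label_messages`), and the `threshold` value of 0.8 are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_messages(messages, fact_checked, threshold=0.8):
    """Return one boolean per message: True if it closely matches any
    fact-checked text. The threshold is a hypothetical value, not one
    taken from the paper."""
    checked_vecs = [Counter(tokenize(c)) for c in fact_checked]
    labels = []
    for message in messages:
        vec = Counter(tokenize(message))
        best = max(
            (cosine_similarity(vec, c) for c in checked_vecs),
            default=0.0,
        )
        labels.append(best >= threshold)
    return labels
```

Messages labeled `False` here would correspond to the "unchecked content" set: they are not confirmed as true, only unmatched against the fact-checked dataset.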
