Anonymity Effects: A Large-Scale Dataset from an Anonymous Social Media Platform

Today, online social media sites serve as the medium of expression for billions of users. As a result, alongside conventional social media sites like Facebook and Twitter, platform designers have introduced many alternative platforms (e.g., 4chan, Whisper, Snapchat, Mastodon) to serve specific userbases. Among these platforms, anonymous social media sites like Whisper and 4chan hold a special place for researchers. Unlike conventional social media sites, posts on anonymous social media sites are not associated with persistent user identities or profiles. These sites can therefore provide a data-driven lens into the effects of anonymity on online user behavior. However, to the best of our knowledge, there are currently no publicly available datasets to facilitate research on these anonymity effects. To that end, in this paper, we publicly release the first large-scale dataset from Whisper, a large anonymous online social media platform. Specifically, our dataset contains 89.8 million Whisper posts (called "whispers") published over a 2-year period from June 6, 2014 to June 6, 2016 (when Whisper was quite popular). Each whisper contains both the post text and associated metadata. The metadata includes information such as the coarse-grained location of upload and the categories of the whisper. We also present preliminary descriptive statistics that demonstrate significant linguistic and categorical diversity in our dataset. We leverage previous work as well as novel analysis to demonstrate that the whispers contain personal emotions and opinions (likely facilitated by a disinhibition complex due to anonymity). Consequently, we envision that our dataset will facilitate novel research ranging from understanding online aggression to detecting depression within online populations.
