Tweets are forever: a large-scale quantitative analysis of deleted tweets

This paper describes an empirical study of 1.6M deleted tweets collected over a continuous one-week period from a set of 292K Twitter users. We examine several aggregate properties of deleted tweets, including their connections to other tweets (e.g., whether they are replies or retweets), the clients used to produce them, temporal aspects of deletion, and the presence of geotagging information. Comparing the deleted tweets with undeleted tweets revealed significant differences between the two collections, namely in the clients used to post them, their conversational aspects, the sentiment vocabulary present in them, and the days of the week on which they were posted. In the other dimensions for which analysis was possible, no substantial differences were found. Finally, we discuss some ramifications of this work for understanding Twitter usage and the management of one's privacy.
