A comprehensive analysis of tweet content and its impact on popularity

With the emergence of online social networks, ordinary people have gained more opportunities to create and publish content. For audiences, however, as the volume of shared content grows, it becomes more important to detect the items that are important and relevant. A significant question, then, is how popular a piece of shared content will become among audiences, regardless of its source. In this paper, we investigate this question by performing a comprehensive analysis of tweet content and studying its impact on tweet popularity, using the number of retweets a tweet receives as the popularity measure. We show that tweets with “social” content generally have a greater chance of becoming popular because of their appeal to society, whereas tweets with “individual” content have little chance of becoming popular. We collect a fair data set of tweets and, to allow a more detailed investigation and to access semantic features, set up an annotation and labeling process. We analyze the informativeness of content-based features and use them to train predictive models. The results clearly show the importance of content-based features. Specifically, they support the idea that whether a tweet is about an individual or a social subject is the most informative content-based feature for predicting popularity, i.e., the number of retweets.
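
To make the notion of feature informativeness concrete, the following is a minimal sketch of one common measure, information gain, applied to a binary popularity label. The abstract does not state which measure is actually used, so the choice of information gain is an assumption, and the feature names (`subject`, `has_url`) and the six toy tweets are hypothetical illustrations rather than data from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy annotations (NOT from the paper): one entry per tweet.
subject = ["social", "social", "social", "individual", "individual", "individual"]
has_url = [True, False, True, False, True, False]
popular = [True, True, True, False, False, False]  # e.g. retweet count above some threshold

gain_subject = information_gain(subject, popular)
gain_url = information_gain(has_url, popular)
print(f"subject: {gain_subject:.3f} bits, has_url: {gain_url:.3f} bits")
```

In this toy setting the `subject` feature separates popular from unpopular tweets perfectly and so receives a higher gain than `has_url`; with real annotated tweets the ranking would depend entirely on the data.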
