On the role of conductance, geography and topology in predicting hashtag virality

We focus on three aspects of the early spread of a hashtag in order to predict whether it will go viral: the network properties of the subset of users tweeting the hashtag, its geographical properties, and, most importantly, its conductance-related properties. One of our significant contributions is showing that conductance-based features play a critical role in successful virality prediction. More specifically, we show that the second derivative of the conductance gives an early indication of whether a hashtag will go viral. We present a detailed experimental evaluation of the effect of each category of features on the virality prediction task. Compared to the baselines and the state-of-the-art techniques proposed in the literature, our feature set achieves significantly better accuracy on a large dataset of 7.7 million users and all their tweets over a period of a month, as well as on existing datasets.
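To make the conductance-based signal concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard graph-theoretic definition of conductance for a user set S, namely the number of edges leaving S divided by the smaller of the volumes (degree sums) of S and its complement, and it approximates the second derivative by a discrete second difference over successive adopter prefixes. The names `conductance`, `conductance_curve`, and `second_difference`, the adjacency-dictionary graph representation, and the choice to return 0.0 for an empty or all-covering set are all illustrative assumptions.

```python
from typing import Dict, Iterable, List, Sequence, Set

# Hypothetical representation: undirected follower graph as node -> set of neighbours.
Graph = Dict[int, Set[int]]


def conductance(adj: Graph, users: Iterable[int]) -> float:
    """Standard conductance of the user set: cut(S) / min(vol(S), vol(V \\ S)).

    Returns 0.0 when either side has zero volume (an assumed convention).
    """
    s = set(users)
    cut = 0      # edges with exactly one endpoint in S
    vol_s = 0    # sum of degrees of nodes in S
    for u in s:
        nbrs = adj.get(u, ())
        vol_s += len(nbrs)
        cut += sum(1 for v in nbrs if v not in s)
    vol_total = sum(len(nbrs) for nbrs in adj.values())
    denom = min(vol_s, vol_total - vol_s)
    return cut / denom if denom > 0 else 0.0


def conductance_curve(adj: Graph, adopters_in_order: Sequence[int]) -> List[float]:
    """Conductance of the adopter set after each new user tweets the hashtag."""
    return [
        conductance(adj, adopters_in_order[:k])
        for k in range(1, len(adopters_in_order) + 1)
    ]


def second_difference(series: Sequence[float]) -> List[float]:
    """Discrete analogue of the second derivative: x[i+1] - 2*x[i] + x[i-1]."""
    return [
        series[i + 1] - 2 * series[i] + series[i - 1]
        for i in range(1, len(series) - 1)
    ]


if __name__ == "__main__":
    # Toy path graph 0 - 1 - 2 - 3; users 0, 1, 2 adopt the hashtag in that order.
    toy: Graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    curve = conductance_curve(toy, [0, 1, 2])
    print(curve)
    print(second_difference(curve))
```

In a prediction pipeline, values such as those returned by `second_difference` over the first few adopters could serve as input features to a classifier; how the features are actually windowed and combined is not specified in the abstract.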
