Cascade size prediction in online social networks

Cascades are an important phenomenon across disciplines such as sociology, economics, psychology, political science, marketing, and epidemiology. The goal of this paper is to develop a model for cascade size prediction in online social networks. Specifically, given the first t1 edges in a cascade, we want to predict, without any a priori information, whether the cascade will have a total of at least t2 (t2 > t1) edges over its lifetime. To this end, we propose a Multi-order Markov Model (M3). Our evaluations using a Twitter data set show that the M3-based cascade size prediction scheme outperforms a baseline scheme based on cascade graph features such as edge growth rate, degree distribution, clustering, and diameter. The M3-based scheme consistently achieves more than 90% prediction accuracy in different experimental scenarios.
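
The prediction task stated above can be read as binary classification over cascade prefixes. The following is a minimal sketch of that problem setup only, not of the M3 model itself. The edge representation and the function name `make_example` are assumptions made for illustration and do not come from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical edge representation: (source user, target user),
# listed in the order the edges were added to the cascade.
Edge = Tuple[str, str]


def make_example(
    cascade: List[Edge], t1: int, t2: int
) -> Optional[Tuple[List[Edge], bool]]:
    """Build one (observed prefix, label) pair for a finished cascade.

    The prefix is the first t1 edges; the label is True when the cascade
    reached at least t2 edges over its lifetime. Cascades with fewer than
    t1 edges are skipped (None), since no full prefix can be observed.
    """
    if t2 <= t1:
        raise ValueError("t2 must be greater than t1")
    if len(cascade) < t1:
        return None
    prefix = cascade[:t1]
    label = len(cascade) >= t2
    return prefix, label


if __name__ == "__main__":
    toy = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e")]
    print(make_example(toy, t1=2, t2=4))
    print(make_example(toy, t1=2, t2=5))
```

Any classifier, whether the M3 scheme or the graph-feature baseline, would then be trained to map the prefix to the label.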
