SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity

Social networking websites allow users to create and share content. Large information cascades can form as users of these sites reshare others' posts with their friends and followers. One of the central challenges in understanding such cascading behavior is forecasting information outbreaks, where a single post becomes widely popular by being reshared by many users. In this paper, we focus on predicting the final number of reshares of a given post. We build on the theory of self-exciting point processes to develop a statistical model that allows us to make accurate predictions. Our model requires no training or expensive feature engineering. It results in a simple and efficiently computable formula that allows us to answer questions in real time, such as: Given a post's resharing history so far, what is our current estimate of its final number of reshares? Is the resharing cascade past its initial stage of explosive growth? And which posts will be the most reshared in the future? We validate our model using one month of complete Twitter data and demonstrate a strong improvement in predictive accuracy over existing approaches. Our model gives only 15% relative error in predicting the final size of an average information cascade after observing it for just one hour.
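To make the notion of a self-exciting point process concrete, the following is a minimal sketch of a generic Hawkes process with an exponential kernel. It is not the SEISMIC model or its formula, which the abstract does not state; the function names, the kernel choice, and the parameters `alpha` (branching ratio) and `beta` (decay rate) are assumptions made purely for illustration. Each past reshare raises the rate of future reshares, and when `alpha < 1` the expected final count given the history so far has a closed form.

```python
import math

def intensity(t, event_times, alpha=0.5, beta=1.0):
    """Rate of new events at time t for a generic self-exciting process.

    Illustrative assumption: each past event at time s contributes
    alpha * beta * exp(-beta * (t - s)), so one event triggers `alpha`
    direct follow-up events on average. No background rate is included.
    """
    return sum(alpha * beta * math.exp(-beta * (t - s))
               for s in event_times if s < t)

def expected_final_count(t, event_times, alpha=0.5, beta=1.0):
    """Expected total number of events given the history observed up to t.

    Each observed event still has alpha * exp(-beta * (t - s)) expected
    direct follow-ups ahead of it; each follow-up starts a sub-cascade of
    expected size 1 / (1 - alpha). Requires alpha < 1 (subcritical regime).
    """
    if alpha >= 1.0:
        raise ValueError("alpha must be < 1 for a finite expected size")
    observed = [s for s in event_times if s <= t]
    pending = sum(alpha * math.exp(-beta * (t - s)) for s in observed)
    return len(observed) + pending / (1.0 - alpha)

def relative_error(predicted, actual):
    """Absolute relative error |predicted - actual| / actual."""
    return abs(predicted - actual) / actual

# Hypothetical reshare times (in hours) for a single post.
reshare_times = [0.0, 1.0]
rate_now = intensity(2.0, reshare_times)
estimate = expected_final_count(2.0, reshare_times)
```

In this toy setting the estimate is always at least the number of reshares already observed, and the gap shrinks as the most recent reshare recedes into the past, which mirrors the abstract's question of whether a cascade is past its stage of explosive growth.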
