Nonlinear Dynamics of Information Diffusion in Social Networks

The recent explosion in the adoption of search engines and new media such as blogs and Twitter has accelerated the propagation of news and rumors. How quickly does a piece of news spread over these media? How does its popularity diminish over time? Does the rising and falling pattern follow a simple universal law? In this article, we propose SpikeM, a concise yet flexible analytical model of the rise and fall patterns of information diffusion. Our model has the following advantages. First, unification power: it explains earlier empirical observations and generalizes theoretical models including the SI and SIR models. We provide the threshold of the take-off versus die-out conditions for SpikeM and discuss the generality of our model by applying it to an arbitrary graph topology. Second, practicality: it matches the observed behavior of diverse sets of real data. Third, parsimony: it requires only a handful of parameters. Fourth, usefulness: it makes it possible to perform analytic tasks such as forecasting, spotting anomalies, and interpretation by reverse engineering the system parameters of interest (quality of news, number of interested bloggers, etc.). We also introduce SpikeStream, an efficient and effective algorithm for the real-time monitoring of information diffusion, which identifies multiple diffusion patterns in a large collection of online event streams. Extensive experiments on real datasets demonstrate that SpikeM accurately and succinctly describes all rise and fall patterns of spikes in social networks.
