Ultrametricity of Information Cascades

Whether it is the inter-arrival time between two consecutive votes on a story on Reddit or between comments on a video shared on YouTube, there is always a hierarchy of time scales in information propagation. One vote or comment might occur almost simultaneously with the previous one, whereas another might occur hours after the preceding one. This hierarchy of time scales leads us to believe that information cascades can be modeled using ultrametricity and ultradiffusion. This paper reports an investigation into cascades of information flow underlying Reddit, YouTube and Digg. An information cascade represents the spread of information from one node to its friends, from those friends to their own friends, and so on. It might be impossible to observe the entire process of information flow, as some of the data pertaining to it might be hidden or inaccessible to us. However, we might be able to observe some counting process that is a consequence of this diffusion. For example, in Digg this counting process might be the temporal variation in the number of votes accrued by a story. In YouTube, it might be the number of comments received by a video over time. We study the dynamics of these votes and comments to better understand information spread. Our observations can be described by a universal function whose parameters depend upon the system under consideration. This function can be derived by using ultrametricity to describe the propagation. The parameters of the ultradiffusion process are learned from the actual observations. We demonstrate that the results predicted by simulating the ultradiffusion process are in close correspondence with the actual observations.
