Pareto cascade modeling of diffusion networks

Time plays an essential role in the diffusion of information, influence, and disease over networks. Usually we can collect only cascade data, in which the infection (receiving) time of each node is recorded but no information about who transmitted to whom is available. In this paper, we infer the transmission rates among nodes using Pareto distributions. Pareto modeling has several advantages. It is naturally motivated and easy to interpret. The scale parameter of a Pareto distribution corresponds to the starting time of a transition: the infection time of a parent node in the cascade data is the starting point for a transition from the parent to its receiver. The shape parameter (alpha) serves as the transition rate. The larger alpha is, the faster the transition, and the higher the probability that disease or information spreads within a short time period. Pareto modeling is also mathematically simple and computationally cheap: the optimization problem that maximizes the time-dependent pairwise transmission likelihoods between all pairs of nodes has explicit solutions. We present three models: one with a common transmission rate, one with different transmission rates, and one with different infection rates. Experiments on real and synthetic data show that our models accurately estimate the transmission rates and outperform the existing method.
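The role of the two Pareto parameters can be illustrated with a minimal sketch. It assumes a deliberately simplified setting that is not the one treated in the paper: every parent-to-receiver pair is known, all pairs share one rate alpha, and every parent infection time is strictly positive (a Pareto scale parameter must be positive). Under these assumptions the receiver's infection time t has density alpha * t_parent^alpha / t^(alpha + 1) for t >= t_parent, and the standard maximum-likelihood estimate for a Pareto shape with known scale is n / sum(log(t_child / t_parent)). The function names below are hypothetical and chosen only for illustration.

```python
import math
import random


def pareto_survival(t, t_parent, alpha):
    """P(transmission has not happened by time t), given the parent was infected at t_parent."""
    if t < t_parent:
        return 1.0
    return (t_parent / t) ** alpha


def pareto_sample(t_parent, alpha, rng):
    """Draw a receiver infection time by inverse-transform sampling."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    return t_parent * u ** (-1.0 / alpha)


def pareto_alpha_mle(parent_times, child_times):
    """Closed-form MLE of a common shape alpha when each pair's scale (parent time) is known."""
    log_ratios = [math.log(tc / tp) for tp, tc in zip(parent_times, child_times)]
    total = sum(log_ratios)
    if total <= 0:
        raise ValueError("child times must exceed parent times for at least one pair")
    return len(log_ratios) / total


# Synthetic check (assumed data, not from the paper): recover a known alpha.
rng = random.Random(0)
true_alpha = 2.5
parents = [1.0 + rng.random() for _ in range(20000)]
children = [pareto_sample(tp, true_alpha, rng) for tp in parents]
estimated_alpha = pareto_alpha_mle(parents, children)
```

In this sketch a larger alpha makes `pareto_survival` decay faster after `t_parent`, which matches the reading of alpha as a transition rate. Inferring rates when the parent of each node is unobserved requires the pairwise likelihood over all candidate parents, which this example does not attempt.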
