Iterative learning of graph connectivity from partially-observed cascade samples

Graph learning is the inference problem of estimating the connectivity of a graph from a collection of epidemic cascades, with applications in online and offline social networks, p2p networks, computer security, and epidemiology. We consider a practical scenario in which the information in cascade samples is only partially observed under the independent cascade (IC) model. For this graph learning problem, we propose an efficient algorithm that solves a localized version of the computationally intractable maximum likelihood estimation through approximations in both the temporal and spatial aspects. Our algorithm alternates between recovering missing time logs and inferring graph connectivity, and thereby progressively improves the inference quality. We study the sample complexity, that is, the number of cascade samples required to meet a given inference quality, and show that it is asymptotically close to a lower bound and thus near-order-optimal in terms of the number of nodes. We evaluate our algorithm on five real-world social networks, whose sizes range from 20 to 900, and demonstrate that it is more accurate than competing algorithms while maintaining a fast running time.
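To make the setting concrete, the following is a minimal sketch of how partially observed cascade samples might be generated under the IC model. It is not the proposed inference algorithm. The function names `simulate_ic_cascade` and `hide_time_logs`, the single shared infection probability `prob`, and the assumption that each node's time log is dropped independently with probability `missing_rate` are all illustrative choices, not details taken from the paper.

```python
import random


def simulate_ic_cascade(adj, prob, seed_node, rng):
    """Simulate one independent cascade and return {node: infection_time}.

    adj maps each node to a list of out-neighbors (hypothetical input format).
    prob is one infection probability shared by all edges (an assumption).
    """
    times = {seed_node: 0}
    frontier = [seed_node]
    t = 0
    while frontier:
        t += 1
        next_frontier = []
        for u in frontier:
            for v in adj.get(u, []):
                # A newly infected node gets exactly one chance to infect
                # each still-uninfected neighbor.
                if v not in times and rng.random() < prob:
                    times[v] = t
                    next_frontier.append(v)
        frontier = next_frontier
    return times


def hide_time_logs(times, missing_rate, rng):
    """Keep each node's time log with probability 1 - missing_rate.

    This independent-dropout observation model is an assumption made
    for illustration only.
    """
    return {v: t for v, t in times.items() if rng.random() >= missing_rate}


if __name__ == "__main__":
    rng = random.Random(0)
    graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
    full = simulate_ic_cascade(graph, prob=0.5, seed_node=0, rng=rng)
    observed = hide_time_logs(full, missing_rate=0.3, rng=rng)
    print("full cascade:    ", full)
    print("observed cascade:", observed)
```

An inference procedure of the kind described above would receive only collections of such `observed` dictionaries and would have to both fill in the hidden time logs and estimate the edges of `graph`.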
