Propagation-Based Temporal Network Summarization

Modern networks are very large and evolve over time. As they grow, so does the cost of analyzing them. A smaller representation of a temporal network that preserves its key properties would help in many data mining tasks. In this paper, we study the novel problem of finding a smaller diffusion-equivalent representation of a set of time-evolving networks. We first formulate a well-founded and general temporal-network condensation problem based on the so-called system matrix of the network. We then propose NetCondense, a scalable and effective algorithm that solves this problem using careful transformations, in sub-quadratic running time and linear space. Our extensive experiments show that we can significantly reduce the size of large real temporal networks (from multiple domains such as social, co-authorship, and email) without much loss of information. We also show the wide applicability of NetCondense by leveraging it for several tasks: for example, we use it to understand, explore, and visualize the original datasets, and to speed up algorithms for the influence-maximization and event-detection problems on temporal networks.
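
The abstract does not define the system matrix, so the following is only a minimal sketch under a stated assumption: it uses the form common in the SIS virus-propagation literature on time-varying networks, where the system matrix of snapshots A_1, ..., A_T is the product of the per-snapshot matrices (1 - delta) I + beta A_t, and its leading eigenvalue governs whether a contagion dies out. The function names and the rates `beta` (infection) and `delta` (recovery) are hypothetical, and this is not the NetCondense algorithm itself.

```python
import numpy as np

def system_matrix(snapshots, beta, delta):
    """Product of per-snapshot matrices (1 - delta) * I + beta * A_t.

    Assumed SIS-style form; `snapshots` is a list of square adjacency
    matrices over the same node set, one per time step.
    """
    n = snapshots[0].shape[0]
    identity = np.eye(n)
    S = np.eye(n)
    for A in snapshots:
        S = S @ ((1.0 - delta) * identity + beta * A)
    return S

def leading_eigenvalue(M):
    """Largest eigenvalue magnitude (spectral radius) of M."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

# Toy temporal network: the same single edge present in two snapshots.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
lam = leading_eigenvalue(system_matrix([A, A], beta=0.5, delta=0.5))
```

Under this assumption, a condensed network would count as diffusion-equivalent to the original when the leading eigenvalues of the two system matrices are close, so a quantity like `lam` is what a condensation step would try to preserve.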
