Efficient Online Summarization of Large-Scale Dynamic Networks

Information diffusion in social networks often involves huge participating communities and highly dynamic viral cascades. Observing, summarizing, and understanding the evolution of dynamic diffusion processes in an informative and insightful way is a challenge of high practical value. However, few existing studies aim to summarize networks for interesting dynamic patterns. Dynamic networks raise challenges not found in static settings, including time sensitivity, online interestingness evaluation, and summary traceability, which render existing techniques inadequate. We propose dynamic network summarization, which summarizes dynamic networks with millions of nodes by capturing only the few most interesting nodes or edges over time. Based on the concepts of diffusion radius and scope, we define interestingness measures for dynamic networks, and we propose OSNet, an online summarization framework for dynamic networks that includes efficient algorithms. We report on extensive experiments with both synthetic and real-life data. The study offers insight into the effectiveness, efficiency, and design properties of OSNet.
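
To make the idea of keeping only the most interesting nodes of a diffusion stream concrete, here is a minimal Python sketch. It is not OSNet and does not reproduce the paper's definitions: the abstract does not spell out how diffusion radius, scope, or interestingness are computed, so the sketch assumes simple proxies (radius as the deepest hop count reached below a node, scope as the number of nodes reached below it, and interestingness as their product). The class name `CascadeSummary` and its methods are hypothetical.

```python
import heapq
from collections import defaultdict


class CascadeSummary:
    """Toy online summary of a diffusion stream (illustrative only, not OSNet)."""

    def __init__(self, k):
        self.k = k
        self.parent = {}                 # child -> node that first reached it
        self.depth = {}                  # node -> hops from its cascade source
        self.scope = defaultdict(int)    # assumed proxy: descendants reached
        self.radius = defaultdict(int)   # assumed proxy: deepest hop count below node

    def observe(self, src, dst):
        """Process one diffusion event: src passes the item on to dst."""
        if src not in self.depth:
            self.depth[src] = 0          # first sighting: treat src as a source
        if dst in self.depth:
            return                       # dst was already reached; ignore repeats
        self.parent[dst] = src
        self.depth[dst] = self.depth[src] + 1
        # Walk up the diffusion tree and credit every ancestor of dst.
        node, hops = src, 1
        while node is not None:
            self.scope[node] += 1
            self.radius[node] = max(self.radius[node], hops)
            node = self.parent.get(node)
            hops += 1

    def score(self, node):
        """Assumed interestingness: radius times scope (not the paper's measure)."""
        return self.radius[node] * self.scope[node]

    def top(self):
        """Return the k highest-scoring nodes as (node, score) pairs."""
        best = heapq.nlargest(self.k, self.scope, key=self.score)
        return [(node, self.score(node)) for node in best]


summary = CascadeSummary(k=2)
for src, dst in [("a", "b"), ("b", "c"), ("b", "d"), ("a", "e")]:
    summary.observe(src, dst)
print(summary.top())
```

Each event costs time proportional to the depth of the cascade at that point, which is acceptable for a toy but would need a cheaper update rule at the scale the abstract describes.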
