Interestingness-Driven Diffusion Process Summarization in Dynamic Networks

The widespread use of social networks enables the rapid diffusion of information, e.g., news, among users in very large communities. Observing and understanding such diffusion processes, which may be modeled as networks that are both large and dynamic, is a substantial challenge. A key tool in this regard is data summarization. However, few existing studies aim to summarize the dynamics of graphs or networks. Dynamic networks raise new challenges not found in static settings, including time sensitivity and the need for online interestingness evaluation and summary traceability, which render existing techniques inapplicable. We study dynamic network summarization: how to summarize dynamic networks with millions of nodes by capturing only the few most interesting nodes or edges over time. We address the problem by finding interestingness-driven diffusion processes. Based on the concepts of diffusion radius and scope, we define interestingness measures for dynamic networks, and we propose OSNet, an online summarization framework for dynamic networks. We report on extensive experiments with both synthetic and real-life data. The study offers insight into the effectiveness and design properties of OSNet.
