Community Detection in Graph Streams by Pruning Zombie Nodes

Community detection in graph streams has attracted considerable attention in recent years. Although many algorithms have been developed from different perspectives, most of them neglect the “zombie” nodes (unimportant nodes) in the graph stream, which can badly affect the community detection result. In this paper, we deal with zombie nodes in order to make the detected communities more robust. The key is a pruning strategy that removes unimportant nodes and preserves important ones. We propose to identify zombie nodes by a degree centrality computed from exponentially time-decaying edge weights, which can be updated efficiently in the graph stream setting. Community kernels are then constructed from the important and active nodes only, and robust community structures are obtained from these kernels, so the zombie nodes no longer influence the result. Because the degree centrality is updated efficiently, the important and active nodes can be readily obtained at each timestamp, which also reduces the computational complexity. Experiments have been conducted to show the effectiveness of the proposed method.
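
The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `DecayingDegree`, the decay rate `lam`, the pruning `threshold`, and the exact decay form `w * exp(-lam * (t - t_edge))` are all assumptions made for illustration, and timestamps are assumed to arrive in non-decreasing order. Each node stores one value and one timestamp, so processing an edge costs constant time per endpoint.

```python
import math
from collections import defaultdict


class DecayingDegree:
    """Degree centrality over exponentially time-decayed edge weights (illustrative)."""

    def __init__(self, lam):
        self.lam = lam                    # decay rate (assumed parameter)
        self.value = defaultdict(float)   # centrality as of the node's last update
        self.last = {}                    # timestamp of the node's last update

    def _touch(self, node, t):
        # Decay the stored value from its last update time up to t.
        if node in self.last:
            self.value[node] *= math.exp(-self.lam * (t - self.last[node]))
        self.last[node] = t

    def add_edge(self, u, v, w, t):
        # A new edge of weight w at time t raises both endpoints' centrality.
        for node in (u, v):
            self._touch(node, t)
            self.value[node] += w

    def centrality(self, node, t):
        # Decayed centrality at query time t, without mutating stored state.
        if node not in self.last:
            return 0.0
        return self.value[node] * math.exp(-self.lam * (t - self.last[node]))

    def active_nodes(self, t, threshold):
        # Nodes below the (hypothetical) threshold are treated as zombies and pruned.
        return {n for n in self.last if self.centrality(n, t) >= threshold}


# Usage: a half-life of one time unit (lam = ln 2).
d = DecayingDegree(lam=math.log(2))
d.add_edge("a", "b", 1.0, 0.0)
d.add_edge("a", "c", 1.0, 1.0)
print(sorted(d.active_nodes(1.0, 0.75)))  # prints ['a', 'c']
```

In this example node "b" has not been touched since time 0, so its centrality has decayed to about 0.5 by time 1 and it falls below the threshold, while "a" and "c" remain active.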
