SHRINK: a structural clustering algorithm for detecting hierarchical communities in networks

Community detection is an important task for mining the structure and function of complex networks. Generally, there are several different kinds of nodes in a network which are cluster nodes densely connected within communities, as well as some special nodes like hubs bridging multiple communities and outliers marginally connected with a community. In addition, it has been shown that there is a hierarchical structure in complex networks with communities embedded within other communities. Therefore, a good algorithm is desirable to be able to not only detect hierarchical communities, but also identify hubs and outliers. In this paper, we propose a parameter-free hierarchical network clustering algorithm SHRINK by combining the advantages of density-based clustering and modularity optimization methods. Based on the structural connectivity information, the proposed algorithm can effectively reveal the embedded hierarchical community structure with multiresolution in large-scale weighted undirected networks, and identify hubs and outliers as well. Moreover, it overcomes the sensitive threshold problem of density-based clustering algorithms and the resolution limit possessed by other modularity-based methods. To illustrate our methodology, we conduct experiments with both real-world and synthetic datasets for community detection, and compare with many other baseline methods. Experimental results demonstrate that SHRINK achieves the best performance with consistent improvements.

[1]  M. Newman,et al.  Finding community structure in very large networks. , 2004, Physical review. E, Statistical, nonlinear, and soft matter physics.

[2]  F. Radicchi,et al.  Benchmark graphs for testing community detection algorithms. , 2008, Physical review. E, Statistical, nonlinear, and soft matter physics.

[3]  S. Fortunato,et al.  Resolution limit in community detection , 2006, Proceedings of the National Academy of Sciences.

[4]  Jean-Loup Guillaume,et al.  Fast unfolding of communities in large networks , 2008, 0803.0476.

[5]  Padhraic Smyth,et al.  A Spectral Clustering Approach To Finding Communities in Graph , 2005, SDM.

[6]  Andrea Lancichinetti,et al.  Community detection algorithms: a comparative analysis: invited presentation, extended abstract , 2009, VALUETOOLS.

[7]  T. Nepusz,et al.  Fuzzy communities and the concept of bridgeness in complex networks. , 2007, Physical review. E, Statistical, nonlinear, and soft matter physics.

[8]  Xiaowei Xu,et al.  SCAN: a structural clustering algorithm for networks , 2007, KDD '07.

[9]  T. Vicsek,et al.  Uncovering the overlapping community structure of complex networks in nature and society , 2005, Nature.

[10]  M E J Newman,et al.  Finding and evaluating community structure in networks. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[11]  Xiaowei Xu,et al.  A Novel Similarity-Based Modularity Function for Graph Partitioning , 2007, DaWaK.

[12]  Roger Guimerà,et al.  Extracting the hierarchical organization of complex systems , 2007, Proceedings of the National Academy of Sciences.

[13]  Hans-Peter Kriegel,et al.  A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise , 1996, KDD.

[14]  M E J Newman,et al.  Fast algorithm for detecting community structure in networks. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[15]  Jiawei Han,et al.  Progressive clustering of networks using Structure-Connected Order of Traversal , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[16]  Andrea Lancichinetti,et al.  Detecting the overlapping and hierarchical community structure in complex networks , 2008, 0802.1218.

[17]  Alex Arenas,et al.  Analysis of the structure of complex networks at different resolution levels , 2007, physics/0703218.

[18]  S.,et al.  An Efficient Heuristic Procedure for Partitioning Graphs , 2022 .

[19]  M E J Newman,et al.  Community structure in social and biological networks , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[20]  Matthew Richardson,et al.  Mining the network value of customers , 2001, KDD '01.

[21]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[22]  W. Zachary,et al.  An Information Flow Model for Conflict and Fission in Small Groups , 1977, Journal of Anthropological Research.

[23]  Christos Faloutsos,et al.  Epidemic thresholds in real networks , 2008, TSEC.

[24]  R. Guimerà,et al.  Functional cartography of complex metabolic networks , 2005, Nature.

[25]  P. Ronhovde,et al.  Multiresolution community detection for megascale networks by information-based replica correlations. , 2008, Physical review. E, Statistical, nonlinear, and soft matter physics.

[26]  Hans-Peter Kriegel,et al.  OPTICS: ordering points to identify the clustering structure , 1999, SIGMOD '99.