Progressive clustering of networks using Structure-Connected Order of Traversal

Network clustering enables us to view a complex network at the macro level by grouping its nodes into units whose characteristics and interrelationships are easier to analyze and understand. State-of-the-art network partitioning methods are unable to identify hubs and outliers. A recently proposed algorithm, SCAN, overcomes this difficulty. However, it requires a minimum similarity parameter ɛ and provides no automated way to choose it. It must therefore be rerun for each ɛ value, and a single run does not capture the variety or hierarchy of clusters. We propose a new algorithm, SCOT (Structure-Connected Order of Traversal), that produces a length-n sequence containing all possible ɛ-clusterings. We also propose a second algorithm, HintClus (Hierarchy-Induced Network Clustering), which clusters the network hierarchically by finding only the best cluster boundaries rather than by agglomerative merging. Results on model-based synthetic network data and real data show that SCOT's execution time is comparable to SCAN's, that HintClus runs in negligible time, and that HintClus produces sensible clusters in the presence of noise.
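To illustrate why the choice of ɛ matters, the following is a minimal Python sketch, not the SCOT or HintClus implementation. It assumes the structural similarity commonly stated for SCAN, namely the number of shared closed neighbors of two adjacent nodes divided by the geometric mean of their closed-neighborhood sizes. The toy graph (two triangles joined by one bridge edge) and the helper names `build_adjacency`, `structural_similarity`, and `edges_above` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def structural_similarity(adj, u, v):
    """Assumed SCAN-style similarity: shared closed neighbors,
    normalized by the geometric mean of the closed-neighborhood sizes."""
    nu = adj[u] | {u}  # closed neighborhood of u (includes u itself)
    nv = adj[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def edges_above(adj, eps):
    """Return the edges whose similarity is at least eps."""
    return sorted(
        (u, v)
        for u in adj
        for v in adj[u]
        if u < v and structural_similarity(adj, u, v) >= eps
    )


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5}, bridged by edge (2,3).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
ADJ = build_adjacency(EDGES)

if __name__ == "__main__":
    for eps in (0.5, 0.7):
        print(eps, edges_above(ADJ, eps))
```

In this toy graph the bridge edge (2, 3) has similarity 0.5, while every edge inside a triangle scores above 0.8. A threshold of 0.7 therefore separates the two triangles and a threshold of 0.5 does not, which is the kind of sensitivity that forces a fixed-ɛ method to be rerun for each candidate value.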
