An Ultra-Fast Modularity-Based Graph Clustering Algorithm

In this paper, we propose a multilevel graph partitioning scheme to speed up a modularity-based graph clustering technique. The modularity-based algorithm was proposed by Newman for partitioning graphs into communities without the input of the number of clusters. The algorithm seeks to maximize a modularity measure. However, its worst-time complexity on sparse graphs is O(n), where n is the number of vertices, which can be prohibitive for many applications. The multilevel graph partitioning scheme consists of three phases: (i) reduction of the size (coarsen) of original graph by collapsing vertices and edges, (ii) partitioning the coarsened graph, and (iii) uncoarsen it to obtain a partition for the original graph. The rationale behind this strategy is to apply a computationally expensive method in a coarsened graph, i.e., with a significantly reduced number of vertices and edges. Empirical evaluation using this approach demonstrate a significant speed up of the modularity-based algorithm, keeping a good quality clusters partitioning.

[1]  George Karypis,et al.  Multilevel k-way Partitioning Scheme for Irregular Graphs , 1998, J. Parallel Distributed Comput..

[2]  Bruce Hendrickson,et al.  A Multi-Level Algorithm For Partitioning Graphs , 1995, Proceedings of the IEEE/ACM SC95 Conference.

[3]  M. Newman,et al.  The structure of scientific collaboration networks. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[4]  M E J Newman,et al.  Finding and evaluating community structure in networks. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[5]  Jeffrey Heer,et al.  prefuse: a toolkit for interactive information visualization , 2005, CHI.

[6]  Charu C. Aggarwal,et al.  Graph Clustering , 2010, Encyclopedia of Machine Learning and Data Mining.

[7]  Vipin Kumar,et al.  Analysis of Multilevel Graph Partitioning , 1995, Proceedings of the IEEE/ACM SC95 Conference.

[8]  David S. Johnson,et al.  Some Simplified NP-Complete Graph Problems , 1976, Theor. Comput. Sci..

[9]  Inderjit S. Dhillon,et al.  Weighted Graph Cuts without Eigenvectors A Multilevel Approach , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  M E J Newman,et al.  Fast algorithm for detecting community structure in networks. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[11]  M. Newman,et al.  Finding community structure in networks using the eigenvectors of matrices. , 2006, Physical review. E, Statistical, nonlinear, and soft matter physics.

[12]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.