Graph Clustering with Density-Cut

How can we find a good graph clustering of a real-world network, one that offers insight into its underlying structure and its potential functions? In this paper, we introduce Dcut, a new graph clustering algorithm that takes a density point of view. The basic idea is to treat graph clustering as a density-cut problem, such that vertices in the same cluster are densely connected and vertices in different clusters are sparsely connected. To identify meaningful clusters (communities) in a graph, a density-connected tree is first constructed in a local fashion. Using this tree, Dcut partitions a graph directly into multiple densely connected, tight-knit clusters. We demonstrate that our method has several attractive benefits: (a) Dcut provides an intuitive criterion for evaluating the goodness of a graph clustering in a more natural and precise way; (b) built on the density-connected tree, Dcut identifies meaningful clusters of densely connected vertices efficiently; (c) the density-connected tree provides a connectivity map of the vertices in a graph from a local density perspective. We systematically evaluate our new clustering approach on synthetic as well as real data to demonstrate its good performance.
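The tree-then-cut idea described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not define the density measure, the tree construction, or the cut criterion, so everything here is an assumption made for illustration. The sketch scores each edge by the Jaccard similarity of its endpoints' closed neighborhoods, grows a spanning tree Prim-style along the highest-scoring edges, and then removes the tree edge with the lowest score relative to the smaller side of the split. The function names (`build_adjacency`, `jaccard`, `density_tree`, `best_cut`) are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def jaccard(adj, u, v):
    """Assumed density proxy: overlap of the closed neighborhoods of u and v."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def density_tree(adj):
    """Grow a spanning tree Prim-style, always attaching the outside vertex
    reached by the highest-similarity edge. Assumes a connected graph."""
    start = min(adj)
    in_tree = {start}
    tree = []  # entries are (parent, child, similarity)
    while len(in_tree) < len(adj):
        sim, u, v = max(
            (jaccard(adj, u, v), u, v)
            for u in in_tree
            for v in adj[u]
            if v not in in_tree
        )
        in_tree.add(v)
        tree.append((u, v, sim))
    return tree


def best_cut(adj):
    """Bipartition the graph by removing one tree edge.
    Assumed criterion: similarity of the removed edge divided by the size
    of the smaller side, minimized over all tree edges."""
    tree = density_tree(adj)
    children = defaultdict(list)
    for parent, child, _ in tree:
        children[parent].append(child)

    def subtree(root):
        # Collect root and all of its descendants in the tree.
        seen, stack = set(), [root]
        while stack:
            node = stack.pop()
            seen.add(node)
            stack.extend(children[node])
        return seen

    n = len(adj)
    best = None
    for _, child, sim in tree:
        side = subtree(child)
        score = sim / min(len(side), n - len(side))
        if best is None or score < best[0]:
            best = (score, side)
    side = best[1]
    return side, set(adj) - side


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
side_a, side_b = best_cut(build_adjacency(edges))
```

On this toy graph the bridge edge has the lowest neighborhood overlap and splits the tree evenly, so the sketch separates the two triangles. Applying the cut recursively to each side would be one way to obtain more than two clusters, again as an assumption rather than a description of Dcut itself.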
