CoreCluster: A Degeneracy Based Graph Clustering Framework

Graph clustering, or community detection, is an important task for investigating the internal structure of graphs, with a plethora of applications in several domains. Traditional tools for graph clustering, such as spectral methods, typically suffer from high time and space complexity. In this article, we present CORECLUSTER, an efficient graph clustering framework based on the concept of graph degeneracy that can be used along with any known graph clustering algorithm. Our approach processes the graph hierarchically, following its core expansion sequence, an ordered partition of the graph into different levels according to the k-core decomposition. Such a partition makes it possible to process the graph incrementally while preserving its clustering structure, and it makes the chosen clustering algorithm run much faster because the partitions on which the algorithm operates are smaller than the full graph.
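To make the k-core decomposition concrete, the following is a minimal Python sketch of the standard peeling procedure: repeatedly remove a vertex of minimum remaining degree and record the largest such degree seen so far as its core number. It is not the CORECLUSTER implementation. The function names `core_numbers` and `levels_by_core` are hypothetical, and grouping vertices by core number from the densest level outward is only an assumption about how the levels of the core expansion sequence might be formed, since the abstract does not define them precisely.

```python
from collections import defaultdict


def core_numbers(adj):
    """Return {vertex: core number} for an undirected graph.

    adj maps each vertex to the set of its neighbours. This simple version
    rescans for the minimum-degree vertex at every step, so it is quadratic
    in the number of vertices; bucket-based variants run in linear time.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        # The core number never decreases along the peeling order.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def levels_by_core(core):
    """Group vertices by core number, densest level first (an assumed ordering)."""
    levels = defaultdict(list)
    for v, k in core.items():
        levels[k].append(v)
    return [(k, sorted(levels[k])) for k in sorted(levels, reverse=True)]


# Toy graph: a 4-clique {a, b, c, d} with a two-vertex tail d - e - f.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
example_levels = levels_by_core(core_numbers(example))
# example_levels == [(3, ['a', 'b', 'c', 'd']), (1, ['e', 'f'])]
```

In this toy graph the clique forms the 3-core and the tail falls into the 1-core, so a clustering algorithm would only need to run on four vertices at the densest level before the remaining vertices are brought in.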
