Finding compact communities in large graphs

This article presents an efficient hierarchical clustering algorithm for core community detection, a variant of the standard community detection problem that focuses on the connected core of each community. To address this problem, we question the standard definitions of communities and propose alternatives. We introduce a function called compactness, designed to assess the quality of a solution. Our algorithm is based on a graph traversal algorithm, LexDFS. The time complexity of our method is O(n log n). Experiments show that our algorithm creates highly compact clusters.
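For readers unfamiliar with the traversal mentioned above, the following is a minimal sketch of a lexicographic depth-first search under stated assumptions. It is not the article's implementation: the function name `lex_dfs`, the adjacency-dictionary input, and the tie-breaking rule (smallest vertex first) are hypothetical choices made for illustration. The sketch is deliberately naive and does not reach the complexity quoted in the abstract. Each unvisited vertex carries a label listing the visit numbers of its already visited neighbours, most recent first, and the next vertex visited is the one with the lexicographically largest label.

```python
def lex_dfs(adj, start):
    """Return a LexDFS visiting order for the graph `adj`, starting at `start`.

    `adj` maps each vertex to an iterable of its neighbours (undirected graph).
    Naive illustrative version: each step scans all unvisited vertices.
    """
    # Label of a vertex: visit numbers of its visited neighbours, newest first.
    labels = {v: [] for v in adj}
    unvisited = set(adj)
    order = []
    nxt = start
    for step in range(1, len(adj) + 1):
        order.append(nxt)
        unvisited.discard(nxt)
        # Prepend the current visit number to every unvisited neighbour's label.
        for w in adj[nxt]:
            if w in unvisited:
                labels[w].insert(0, step)
        if unvisited:
            # Python compares lists lexicographically; ties are broken by
            # taking the smallest vertex (an assumption of this sketch).
            nxt = max(sorted(unvisited), key=lambda v: labels[v])
    return order


# Small hypothetical example graph.
example = {0: [1, 2], 1: [0, 3, 4], 2: [0, 3], 3: [1, 2], 4: [1]}
order = lex_dfs(example, 0)
```

In this example, after vertices 0 and 1 are visited, vertex 3 is preferred over vertex 2 because its label refers to the more recently visited neighbour. Vertex 2 then overtakes vertex 4 because it is adjacent to the vertex visited last.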
