A LexDFS-Based Approach on Finding Compact Communities

This article presents an efficient hierarchical clustering algorithm based on a graph traversal called LexDFS. This traversal has the property of going through the clustered parts of the graph in a small number of iterations, which makes them recognisable. The time complexity of our method is O(n × log(n)). It is simple to implement, and a thorough study shows that it outputs clusterings that are closer to some ground truths than those of its competitors. Experiments are also carried out to analyse the behaviour of the algorithm during execution on sample graphs. This article also introduces a quality function called compactness, which measures how efficient a cluster is for internal communication. We prove that this quality function has interesting theoretical properties.
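To make the traversal concrete, here is a minimal sketch of the textbook LexDFS rule, not the clustering method of the article itself. Each unvisited vertex carries a label listing the visit numbers of its already visited neighbours, most recent first, and the next vertex visited is the one with the lexicographically largest label. The function name `lexdfs_order`, the adjacency-dictionary input, and the tie-breaking by dictionary order are assumptions made for illustration; this naive version is quadratic and does not aim at the complexity stated above.

```python
def lexdfs_order(adj, start):
    """Return the vertices of an undirected graph in a LexDFS visiting order.

    adj   : dict mapping each vertex to a list of its neighbours (assumed format)
    start : vertex visited first
    """
    # Label of a vertex: visit numbers of its visited neighbours, most recent first.
    labels = {v: [] for v in adj}
    visited = set()
    order = []
    for step in range(1, len(adj) + 1):
        if step == 1:
            v = start
        else:
            # Python compares lists lexicographically; ties go to the first
            # candidate in dictionary order (an arbitrary, illustrative choice).
            candidates = [u for u in adj if u not in visited]
            v = max(candidates, key=lambda u: labels[u])
        visited.add(v)
        order.append(v)
        # Prepend the current visit number to every unvisited neighbour's label.
        for w in adj[v]:
            if w not in visited:
                labels[w].insert(0, step)
    return order


# Small example: a 4-cycle 0-1-3-2-0 with a pendant vertex 4 attached to 3.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
print(lexdfs_order(graph, 0))  # → [0, 1, 3, 2, 4]
```

After vertex 3 is visited, vertex 2 has label [3, 1] while vertex 4 has label [3], so vertex 2 is visited first. A plain depth-first search would be free to pick either; LexDFS prefers the vertex with more links to the part already explored, which gives an intuition for why densely connected parts of the graph tend to be traversed in consecutive steps.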
