Computing Communities in Large Networks Using Random Walks

Dense subgraphs of sparse graphs (communities), which appear in most real-world complex networks, play an important role in many contexts. Computing them, however, is generally expensive. We propose a measure of similarity between vertices based on random walks, which has several important advantages: it captures the community structure of a network well, it can be computed efficiently, and it can be used in an agglomerative algorithm to compute the community structure of a network efficiently. We propose such an algorithm, called Walktrap, which runs in time O(mn^2) and space O(n^2) in the worst case, and in time O(n^2 log n) and space O(n^2) in most real-world cases, where n and m are respectively the number of vertices and edges in the input graph. Extensive comparison tests show that our algorithm surpasses previously proposed ones in the quality of the community structures obtained, and that it ranks among the best in running time.
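
To make the idea of a random-walk similarity concrete, the following is a minimal sketch and not the implementation behind the reported results. The abstract does not give the formula, so the sketch assumes one plausible form: the degree-normalized Euclidean distance between the probability distributions of two short random walks started at the two vertices. The function names, the walk length `t`, and the example graph (two triangles joined by one edge) are hypothetical and chosen only for illustration.

```python
from math import sqrt


def walk_distribution(adj, start, t):
    """Distribution of a uniform random walk after t steps from `start`.

    `adj` is a list of neighbor lists; every vertex is assumed to have
    at least one neighbor.
    """
    p = [0.0] * len(adj)
    p[start] = 1.0
    for _ in range(t):
        nxt = [0.0] * len(adj)
        for u, mass in enumerate(p):
            if mass == 0.0:
                continue
            share = mass / len(adj[u])  # move to each neighbor with equal probability
            for v in adj[u]:
                nxt[v] += share
        p = nxt
    return p


def walk_distance(adj, i, j, t=3):
    """Assumed distance: L2 gap between t-step distributions, weighted by 1/degree."""
    pi = walk_distribution(adj, i, t)
    pj = walk_distribution(adj, j, t)
    return sqrt(sum((a - b) ** 2 / len(adj[k])
                    for k, (a, b) in enumerate(zip(pi, pj))))


# Hypothetical example: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
same_side = walk_distance(adj, 0, 1)   # both vertices in the first triangle
across = walk_distance(adj, 0, 5)      # vertices in different triangles
```

Under these assumptions, `same_side` comes out smaller than `across`: short walks started inside the same dense subgraph end up in similar places. An agglomerative scheme can exploit this by repeatedly merging the closest groups of vertices.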
