Finding Community Structure with Performance Guarantees in Complex Networks

Many networks, including social networks, computer networks, and biological networks, are found to divide naturally into communities of densely connected individuals. Finding community structure is one of the fundamental problems in network science. Since Newman proposed \emph{modularity} as a measure of the quality of a community structure, many efficient methods for maximizing modularity have been proposed, but none guarantees optimality. In this paper, we propose two polynomial-time algorithms for the modularity maximization problem with theoretical performance guarantees. The first algorithm comes with an \emph{a priori} guarantee that the modularity of the community structure it finds is within a constant factor of the optimal modularity when the network has a power-law degree distribution. Although it is mainly of theoretical interest, to the best of our knowledge this is the first approximation algorithm for finding community structure in networks. For our second algorithm, we propose the \emph{sparse metric}, a substantially faster linear programming (LP) method for maximizing modularity, and apply a rounding technique based on this sparse metric that carries an \emph{a posteriori} approximation guarantee. Our experiments show that the rounding algorithm returns optimal solutions in most cases and is very scalable: it runs on networks of a few thousand nodes, whereas the LP solution in the literature was run only on networks of at most 235 nodes.
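To make the objective concrete, the following minimal sketch computes the modularity of a given partition. It assumes the standard form of Newman's modularity for an undirected, unweighted graph without self-loops, Q = sum over communities c of (e_c / m - (d_c / 2m)^2), where m is the number of edges, e_c the number of edges inside c, and d_c the total degree of the nodes in c. The function name `modularity` and the example graph are illustrative assumptions and do not come from the paper; the code only evaluates Q and is not either of the proposed algorithms.

```python
from collections import defaultdict


def modularity(edges, community):
    """Return Newman's modularity Q of a partition.

    edges: list of (u, v) pairs of an undirected, unweighted graph
           (assumed to contain no self-loops or duplicate edges).
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    intra = defaultdict(int)       # e_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in c

    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra[cu] += 1

    # Q = sum_c ( e_c / m - (d_c / (2m))^2 )
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical example: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {node: "a" for node in range(6)}

q_split = modularity(edges, two_triangles)  # 2 * (3/7 - 1/4) = 5/14
q_single = modularity(edges, one_block)     # a single community gives 0
```

In this example, splitting the graph into its two triangles scores higher than keeping all nodes in one community, which is the behavior a modularity-maximizing method exploits.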
