Network Clustering via Maximizing Modularity: Approximation Algorithms and Theoretical Limits

Many social networks and complex systems are naturally divided into clusters of densely connected nodes, known as community structure (CS). Finding CS is one of the fundamental yet challenging problems in network science. One of the most popular classes of methods for this problem is to maximize Newman's modularity. However, little is understood about how well the maximum modularity can be approximated, or about the implications of finding community structure with provable guarantees. In this paper, we definitively settle the approximability of modularity clustering, proving that approximating the problem within any positive multiplicative factor is intractable unless P = NP. Nevertheless, we propose the first additive approximation algorithm for modularity clustering with a constant factor. Moreover, we provide a rigorous proof that a CS with modularity arbitrarily close to the maximum modularity Q_OPT might bear no similarity to the optimal CS of maximum modularity. Thus, even when a CS with near-optimal modularity is found, other verification methods are needed to confirm the significance of the structure.
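The abstract does not restate the objective being maximized. As a minimal sketch, assuming an undirected, unweighted graph and the standard definition of Newman's modularity, Q = sum over communities c of (m_c / m) - (d_c / 2m)^2, where m is the number of edges, m_c the number of edges inside c, and d_c the total degree of nodes in c, the quantity can be computed as follows. The function name `modularity` and the input format (an edge list plus a node-to-community mapping) are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman's modularity Q of a partition of an undirected, unweighted graph.

    edges        -- list of (u, v) pairs, each undirected edge listed once
    community_of -- dict mapping every node to its community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    internal = defaultdict(int)    # m_c: edges with both endpoints in community c
    degree_sum = defaultdict(int)  # d_c: total degree of the nodes in community c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Q = sum_c [ m_c / m - (d_c / (2m))^2 ]
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "all" for node in range(6)}

q_split = modularity(edges, two_triangles)   # 2 * (3/7 - 1/4) = 5/14
q_single = modularity(edges, one_cluster)    # 7/7 - 1 = 0
```

Placing all nodes in one cluster always yields Q = 0, so the maximum modularity is never negative. A multiplicative guarantee is measured as a ratio to that maximum, whereas an additive guarantee bounds the absolute gap to it, which is the distinction the abstract draws between its hardness result and its algorithm.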
