Community Detection in Scale-Free Networks: Approximation Algorithms for Maximizing Modularity

Many networks, regardless of their function and scope, converge to a scale-free architecture in which the degree distribution approximately follows a power law. Many of these scale-free networks are also found to divide naturally into communities of densely connected nodes, a property known as community structure. Finding this community structure is a fundamental but challenging problem in network science. Since Newman proposed modularity as a measure to quantify the strength of community structure, many efficient methods that find community structure by maximizing modularity have been proposed. However, approximation algorithms that provide provable quality bounds for the problem are lacking. In this paper, we propose polynomial-time approximation algorithms for the modularity maximization problem, together with their theoretical justifications, in the context of scale-free networks. We prove that the solutions of the proposed algorithms, even in the worst case, are optimal up to a constant factor for scale-free networks with either bidirectional or unidirectional links. Although our focus in this work is not on designing another empirically good algorithm for detecting community structure, experiments on real-world networks suggest that the proposed algorithm is competitive with the state-of-the-art modularity maximization algorithm.
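
For readers unfamiliar with the objective, the sketch below computes the standard modularity score Q = sum over communities c of [ L_c / m - (d_c / (2m))^2 ] for an undirected, unweighted graph, where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. It illustrates only the quantity being maximized, not the approximation algorithms proposed in the paper. The function name `modularity`, the edge-list input format, and the toy graph are assumptions made for this example.

```python
from collections import defaultdict

def modularity(edges, community):
    """Standard modularity Q of a partition (undirected, unweighted graph).

    edges: list of (u, v) pairs, each undirected edge listed once (assumed format).
    community: dict mapping every node to a community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    internal = defaultdict(int)    # L_c: edges with both endpoints in community c
    degree_sum = defaultdict(int)  # d_c: total degree of the nodes in community c

    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Q = sum_c [ L_c / m - (d_c / (2m))^2 ]
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_two = modularity(edges, two_groups)  # 2 * (3/7 - (7/14)^2) = 5/14
q_one = modularity(edges, one_group)   # 7/7 - (14/14)^2 = 0
```

Splitting the toy graph into its two triangles scores higher than placing all nodes in one community, which is the behaviour a modularity-maximizing method exploits.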
