Permanence and Community Structure in Complex Networks

The goal of community detection algorithms is to identify densely connected units within large networks. An implicit assumption is that all the constituent nodes belong equally to their associated community. However, some nodes are more important in the community than others. To date, efforts have focused primarily on identifying communities as a whole rather than on understanding the extent to which an individual node belongs to its community. As a result, most metrics for evaluating communities, for example modularity, are global: they produce a score for each community, not for each individual node. In this article, we argue that the belongingness of nodes in a community is not uniform. We quantify the degree of belongingness of a vertex within a community by a new vertex-based metric called permanence. The central idea of permanence is based on the observation that the strength of membership of a vertex to a community depends upon two factors: (i) the extent of connections of the vertex within its community versus outside its community, and (ii) how tightly the vertex is connected internally. We present the formulation of permanence based on these two quantities. We demonstrate that, compared to other existing metrics (such as modularity, conductance, and cut-ratio), the change in permanence is more commensurate with the level of perturbation in ground-truth communities. We discuss how permanence can help us understand and utilize the structure and evolution of communities by demonstrating that it can be used to (i) measure the persistence of a vertex in a community, (ii) design strategies to strengthen the community structure, (iii) explore the core-periphery structure within a community, and (iv) select suitable initiators for message spreading. We further show that permanence is an excellent metric for identifying communities.
We demonstrate that the process of maximizing permanence (abbreviated as MaxPerm) produces meaningful communities that concur with the ground-truth community structure of the networks more accurately than eight other popular community detection algorithms. Finally, we provide mathematical proofs to demonstrate the correctness of finding communities by maximizing permanence. In particular, we show that the communities obtained by this method are (i) less affected by changes in vertex ordering, and (ii) more resilient to the resolution limit, degeneracy of solutions, and asymptotic growth of values.
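
The abstract describes the two ingredients of permanence only in words. As a rough illustration, the sketch below computes a per-vertex score on an unweighted, undirected graph, assuming the form Perm(v) = (I(v) / E_max(v)) * (1 / D(v)) - (1 - c_in(v)), where I(v) is the number of neighbours inside v's community, E_max(v) is the largest number of neighbours in any single other community, D(v) is the degree, and c_in(v) is the clustering coefficient among the internal neighbours. That formula is not stated in the text above and is an assumption here, as are the function name `permanence`, the adjacency-dict input format, and the conventions E_max = 1 when there are no external neighbours and c_in = 0 when there are fewer than two internal neighbours.

```python
from itertools import combinations


def permanence(adj, community, v):
    """Assumed permanence score of vertex v.

    adj: dict mapping each node to the set of its neighbours (undirected).
    community: dict mapping each node to its community label.
    """
    neighbours = adj[v]
    degree = len(neighbours)
    if degree == 0:
        return 0.0  # assumption: isolated vertices score zero

    own = community[v]
    internal = [u for u in neighbours if community[u] == own]

    # Factor (i): internal connections versus the strongest external pull,
    # i.e. the largest number of neighbours in any single other community.
    external_counts = {}
    for u in neighbours:
        label = community[u]
        if label != own:
            external_counts[label] = external_counts.get(label, 0) + 1
    e_max = max(external_counts.values(), default=1)  # assumption: 1 if none

    # Factor (ii): how tightly the internal neighbours are linked to each
    # other (clustering coefficient restricted to the internal neighbours).
    k = len(internal)
    if k < 2:
        c_in = 0.0  # assumption: undefined case treated as zero
    else:
        links = sum(1 for a, b in combinations(internal, 2) if b in adj[a])
        c_in = links / (k * (k - 1) / 2)

    return (k / e_max) * (1 / degree) - (1 - c_in)


# Toy graph (hypothetical): a triangle a-b-c in community 0,
# and a path c-d-e whose last two vertices form community 1.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
comm = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}

for node in sorted(adj):
    print(node, round(permanence(adj, comm, node), 3))
```

On this toy graph the vertex with no external neighbours and fully linked internal neighbours scores highest, while the boundary vertex `d`, with one internal and one external neighbour, scores lowest, which matches the intuition that membership strength is not uniform within a community.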
