Mixture models with entropy regularization for community detection in networks

Abstract: Community detection is a key exploratory tool in network analysis and has received much attention in recent years. Newman’s mixture model (NMM) is one of the best models for exploring a range of network structures, including community, bipartite, and core–periphery structures. However, NMM requires the number of communities to be specified in advance. In this study, we therefore propose an entropy-regularized mixture model (EMM) that infers the number of communities and identifies the structure of a network simultaneously. The model minimizes the entropy of the mixing coefficients of NMM within an expectation–maximization (EM) procedure, so that small clusters containing little information are discarded step by step. Empirical studies on both synthetic and real networks show that EMM is superior to the state-of-the-art methods.
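The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the update equations, so everything below is an assumption made for illustration: the E-step and the link-probability update follow the usual form of a Newman-style mixture model, the entropy penalty on the mixing coefficients uses an update of the kind found in robust EM clustering, and the function name `entropy_penalized_em`, the weight `beta`, and the `1/n` pruning threshold are hypothetical choices, not the authors' method.

```python
import numpy as np

def entropy_penalized_em(A, k_init=6, beta=0.5, n_iter=200, seed=0):
    """Illustrative sketch only. A is an (n, n) 0/1 adjacency matrix.

    Returns (pi, theta, q): mixing coefficients, per-group link
    probabilities, and node-to-group responsibilities.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    pi = np.full(k_init, 1.0 / k_init)               # start with too many groups
    theta = rng.dirichlet(np.ones(n), size=k_init)   # each row sums to 1
    for _ in range(n_iter):
        # E-step: q[i, r] is proportional to pi[r] * prod_j theta[r, j] ** A[i, j]
        log_q = np.log(pi)[None, :] + A @ np.log(theta + 1e-12).T
        log_q -= log_q.max(axis=1, keepdims=True)    # numerical stability
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)

        # M-step for pi with an assumed entropy penalty: groups whose
        # log-weight is below the pi-weighted average are shrunk.
        pi_em = q.mean(axis=0)
        log_pi = np.log(pi)
        pi_new = pi_em + beta * pi * (log_pi - np.sum(pi * log_pi))

        # Discard groups that fall below the (hypothetical) 1/n threshold.
        keep = pi_new > 1.0 / n
        if not keep.any():
            keep[np.argmax(pi_new)] = True
        pi = pi_new[keep] / pi_new[keep].sum()
        q = q[:, keep] + 1e-12
        q /= q.sum(axis=1, keepdims=True)

        # M-step for theta: theta[r, j] proportional to sum_i q[i, r] * A[i, j]
        theta = q.T @ A + 1e-12
        theta /= theta.sum(axis=1, keepdims=True)
    return pi, theta, q

# Toy network: two 10-node cliques joined by a single edge.
n = 20
A = np.zeros((n, n))
A[:10, :10] = 1.0
A[10:, 10:] = 1.0
np.fill_diagonal(A, 0.0)
A[9, 10] = A[10, 9] = 1.0

pi, theta, q = entropy_penalized_em(A)
labels = q.argmax(axis=1)   # hard assignment of each node to a surviving group
```

The number of surviving entries in `pi` plays the role of the inferred number of communities. How many groups survive depends on `beta`, the threshold, and the initialization, all of which are assumptions of this sketch.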
