Efficient method for estimating the number of communities in a network

Although a wide range of effective methods exists for community detection in networks, most require the number of communities to be known in advance. Here we present a method for estimating the number of communities in a network using a combination of Bayesian inference with a novel prior and an efficient Monte Carlo sampling scheme. We test the method extensively on both real and computer-generated networks, showing that it performs accurately and consistently, even when groups vary widely in size or structure.
