Learning Latent Block Structure in Weighted Networks

Community detection is an important task in network analysis, in which we aim to learn a network partition that groups together vertices with similar community-level connectivity patterns. By finding such groups of vertices with similar structural roles, we extract a compact representation of the network's large-scale structure, which can facilitate its scientific interpretation and the prediction of unknown or future interactions. Popular approaches, including the stochastic block model, assume edges are unweighted, which limits their utility by discarding potentially useful information. We introduce the "weighted stochastic block model" (WSBM), which generalizes the stochastic block model to networks with edge weights drawn from any exponential family distribution. This model learns from both the presence and the weight of edges, allowing it to discover structure that would otherwise remain hidden when weights are discarded or thresholded. We describe a Bayesian variational algorithm for efficiently approximating this model's posterior distribution over latent block structures. We then evaluate the WSBM's performance on both edge-existence and edge-weight prediction tasks for a set of real-world weighted networks. In all cases, the WSBM performs as well as or better than the best alternatives on these tasks.
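As a rough illustration of the kind of generative process described above, the following Python sketch samples a small undirected network in which both the presence and the weight of each edge depend only on the block labels of its endpoints. It is a minimal sketch under stated assumptions, not the inference algorithm or the exact model specification: the function name `sample_wsbm` and its parameters are hypothetical, edge existence is assumed to be Bernoulli, and weights are assumed to be Normal (one member of the exponential family).

```python
import numpy as np

def sample_wsbm(block_sizes, edge_prob, weight_mean, weight_std, seed=0):
    """Sample an undirected weighted block-structured network (illustrative only).

    Assumptions (not taken from the source): Bernoulli edge existence,
    Normal edge weights, and NaN used to mark absent edges.
    """
    rng = np.random.default_rng(seed)
    # z[i] is the latent block label of vertex i.
    z = np.repeat(np.arange(len(block_sizes)), block_sizes)
    n = z.size
    # Expand the K x K block parameters to n x n per-pair parameters.
    p = edge_prob[z][:, z]
    mu = weight_mean[z][:, z]
    sigma = weight_std[z][:, z]
    # Draw each vertex pair once (strict upper triangle), then mirror it.
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)
    exists = (rng.random((n, n)) < p) & upper
    weights = np.where(exists, rng.normal(mu, sigma), np.nan)
    # Lower triangle copies the upper one; the diagonal stays NaN (no self-loops).
    return z, np.where(upper, weights, weights.T)

# Two blocks with dense, positively weighted ties inside block 0 and
# negatively weighted ties inside block 1 (all values are made up).
z, W = sample_wsbm(
    block_sizes=[3, 4],
    edge_prob=np.array([[0.9, 0.1], [0.1, 0.8]]),
    weight_mean=np.array([[5.0, 0.0], [0.0, -5.0]]),
    weight_std=np.ones((2, 2)),
)
```

In this toy setup, thresholding `W` to a binary adjacency matrix would erase the difference in weight sign between the two blocks, which is the kind of information the abstract argues is lost when weights are discarded.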
