Efficiently and Fast Learning a Fine-grained Stochastic Blockmodel from Large Networks

The stochastic blockmodel (SBM) has recently come into the spotlight in social network analysis and statistical machine learning, because it lets us decompose and then analyze an exploratory network without any a priori knowledge of its intrinsic structure. However, the prohibitive computational cost restricts SBM learning algorithms that are capable of model selection to small networks with hundreds of nodes. This paper presents a fine-grained SBM and a fast learning algorithm for it, named FSL, which combines the component-wise EM (CEM) algorithm with the minimum message length (MML) criterion so that parameter estimation and model evaluation are carried out in parallel. FSL significantly reduces the time complexity of learning and scales to networks with thousands of nodes. The experimental results indicate that FSL achieves the best tradeoff between effectiveness and efficiency by greatly reducing learning time while preserving competitive learning accuracy. Moreover, the proposed method shows excellent generalization ability when applied to link prediction.
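
For readers unfamiliar with the model family, the following is a minimal sketch of the generative process of a standard, undirected SBM. It is not the fine-grained SBM or the FSL algorithm of this paper, whose details are not given in the abstract. The function name `sample_sbm` and the parameter names `omega` (block proportions) and `pi` (block-to-block link probabilities) are assumptions chosen for illustration.

```python
import numpy as np

def sample_sbm(n, omega, pi, seed=0):
    """Sample an undirected network from a standard SBM (illustrative only).

    n     : number of nodes
    omega : length-K block proportions, summing to 1 (hypothetical name)
    pi    : K x K symmetric matrix of link probabilities (hypothetical name)
    """
    rng = np.random.default_rng(seed)
    omega = np.asarray(omega, dtype=float)
    pi = np.asarray(pi, dtype=float)

    # Each node is assigned to one latent block.
    z = rng.choice(len(omega), size=n, p=omega)

    # Link probability for every node pair depends only on the two blocks.
    probs = pi[z[:, None], z[None, :]]

    # Draw the upper triangle once, then mirror it; no self-loops.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(int)
    return z, adj

# Two equally sized blocks with dense within-block and sparse between-block links.
z, adj = sample_sbm(200, [0.5, 0.5], [[0.30, 0.02], [0.02, 0.30]])
```

Learning an SBM is the inverse problem: given only `adj`, recover the block assignments, the parameters, and the number of blocks. The last of these is the model selection step that the abstract identifies as the computational bottleneck.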
