The dynamic stochastic topic block model for dynamic networks with textual edges

This paper develops a probabilistic model for clustering the nodes of a dynamic graph, accounting for both the content of textual edges and their frequency. Vertices are clustered into groups that are homogeneous in terms of both interaction frequency and discussed topics. The dynamic graph is considered stationary on a latent time interval if the proportions of topics discussed between each pair of node groups do not change during that interval. Inference is performed with a classification variational expectation–maximization algorithm. A model selection criterion is also derived to select the number of node groups, time clusters and topics. Experiments on simulated data are carried out to assess the proposed methodology. Finally, we illustrate the method with an application to the Enron dataset.
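The stationarity notion in the abstract can be illustrated with a small sketch. It is not the paper's inference procedure: it assumes the topic proportions for each pair of node groups are already known at every time step, and it assumes stationary periods are contiguous. The function name `stationary_segments`, the data layout and the tolerance `tol` are all hypothetical choices made for this illustration.

```python
def stationary_segments(topic_props, tol=1e-9):
    """Split a sequence of time steps into maximal stationary segments.

    topic_props[t] maps a group pair (q, r) to its list of topic
    proportions at time step t (hypothetical layout, assumed known).
    Two consecutive time steps belong to the same segment when every
    group pair has the same topic proportions, up to `tol`.
    Returns a list of (start, end) index pairs, both inclusive.
    """
    def same(a, b):
        # Same group pairs, and matching proportions for each pair.
        return a.keys() == b.keys() and all(
            len(a[k]) == len(b[k])
            and all(abs(x - y) <= tol for x, y in zip(a[k], b[k]))
            for k in a
        )

    segments = []
    start = 0
    for t in range(1, len(topic_props)):
        if not same(topic_props[t - 1], topic_props[t]):
            segments.append((start, t - 1))
            start = t
    if topic_props:
        segments.append((start, len(topic_props) - 1))
    return segments


# Two node-group pairs, two topics, three time steps.
props = [
    {(0, 0): [0.7, 0.3], (0, 1): [0.5, 0.5]},
    {(0, 0): [0.7, 0.3], (0, 1): [0.5, 0.5]},
    {(0, 0): [0.2, 0.8], (0, 1): [0.5, 0.5]},  # pair (0, 0) changes topics
]
print(stationary_segments(props))
```

Here the first two time steps form one stationary segment, and the change in the topic mix of pair (0, 0) at the third step opens a new one. In the model itself the group memberships, time clusters and topic proportions are latent and are estimated jointly rather than read off as above.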
