Bayesian model selection for the latent position cluster model for social networks

The latent position cluster model is a popular model for the statistical analysis of network data. It assumes an underlying latent space in which the actors follow a finite mixture distribution, and actors that are close in this latent space tend to be tied by an edge. The approach is appealing because it clusters the actors, which provides the practitioner with useful qualitative information. However, exploring the uncertainty in the number of components in the mixture distribution is a complex task. The current state of the art uses an approximate form of BIC for this purpose, in which an approximation of the log-likelihood replaces the true log-likelihood, which is unavailable. The main contribution of this paper is to show that, through the use of conjugate prior distributions, it is possible to analytically integrate out almost all of the model parameters, leaving a posterior distribution which depends on the allocation vector of the mixture model. Consequently, posterior inference over the number of components in the latent mixture distribution can be carried out without trans-dimensional MCMC algorithms such as reversible jump MCMC. Moreover, our algorithm gives more reasonable computation times for larger networks than the standard methods using the latentnet package (Krivitsky and Handcock 2008; Krivitsky and Handcock 2013).
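As an illustration of what integrating out parameters under a conjugate prior looks like, the sketch below collapses the mixture weights only. It assumes a symmetric Dirichlet(alpha, ..., alpha) prior on the weights, which is a standard choice but is not stated in the abstract. The function name `log_allocation_prior` and its arguments are hypothetical, and the remaining parameters of the model are not treated here. Under that assumption the marginal prior of an allocation vector c given G components has the closed form Gamma(G alpha) / Gamma(n + G alpha) times the product over g of Gamma(n_g + alpha) / Gamma(alpha), where n_g is the number of actors allocated to component g.

```python
from collections import Counter
from math import exp, lgamma


def log_allocation_prior(allocations, n_components, alpha=1.0):
    """Log of p(c | G) with the mixture weights integrated out.

    Assumes a symmetric Dirichlet(alpha, ..., alpha) prior on the weights
    (an illustrative assumption, not taken from the paper). Labels in
    `allocations` must lie in 0, ..., n_components - 1.
    """
    n = len(allocations)
    counts = Counter(allocations)
    if any(not 0 <= g < n_components for g in counts):
        raise ValueError("allocation label outside 0..n_components-1")
    # Normalising constants of the Dirichlet prior and posterior.
    log_p = lgamma(n_components * alpha) - lgamma(n + n_components * alpha)
    # One term per component; empty components contribute zero.
    for g in range(n_components):
        log_p += lgamma(counts.get(g, 0) + alpha) - lgamma(alpha)
    return log_p


# The same allocation vector can be scored under different numbers of
# components, since no weight vector of varying dimension is involved.
c = [0, 0, 1, 1, 1]
for G in (2, 3, 4):
    print(G, exp(log_allocation_prior(c, G)))
```

Because the result depends only on the allocation vector and the number of components, values for different G can be compared directly, which is the property that makes a fixed-dimension sampler over allocations possible.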

[1]  R. Sibson Studies in the Robustness of Multidimensional Scaling: Procrustes Statistics , 1978 .

[2]  Sylvia Richardson,et al.  Markov Chain Monte Carlo in Practice , 1997 .

[3]  George Michailidis,et al.  Statistical Challenges in Biological Networks , 2012 .

[4]  Petros Dellaportas,et al.  Multivariate mixtures of normals with unknown number of components , 2006, Stat. Comput..

[5]  Peter D. Hoff,et al.  Latent Space Approaches to Social Network Analysis , 2002 .

[6]  Eric D. Kolaczyk,et al.  Statistical Analysis of Network Data: Methods and Models , 2009 .

[7]  Jeffrey W. Miller,et al.  Mixture Models With a Prior on the Number of Components , 2015, Journal of the American Statistical Association.

[8]  Adrian E. Raftery,et al.  Model-Based Clustering, Discriminant Analysis, and Density Estimation , 2002 .

[9]  D. Lusseau,et al.  The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations , 2003, Behavioral Ecology and Sociobiology.

[10]  M. Stephens Bayesian analysis of mixture models with an unknown number of components: an alternative to reversible jump methods , 2000 .

[11]  Pavel N. Krivitsky,et al.  Fitting Position Latent Cluster Models for Social Networks with latentnet , 2008, Journal of Statistical Software.

[12]  P. Green,et al.  On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion) , 1997 .

[13]  Nial Friel,et al.  Block clustering with collapsed latent block models , 2010, Statistics and Computing.

[14]  Peter D. Hoff,et al.  Fast Inference for the Latent Space Network Model Using a Case-Control Approximate Likelihood , 2012, Journal of Computational and Graphical Statistics.

[15]  Thomas Brendan Murphy,et al.  Variational Bayesian inference for the Latent Position Cluster Model , 2009, NIPS 2009.

[16]  Agostino Nobile,et al.  Bayesian finite mixtures with an unknown number of components: The allocation sampler , 2007, Stat. Comput..

[17]  S. Wasserman, P. Pattison  Logit Models and Logistic Regressions for Social Networks: I. An Introduction to Markov Graphs and p* , 1996 .

[18]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[19]  Michalis Faloutsos,et al.  On power-law relationships of the Internet topology , 1999 .

[20]  Peng Wang,et al.  Recent developments in exponential random graph (p*) models for social networks , 2007, Soc. Networks.

[21]  N. Breslow,et al.  Statistics in Epidemiology : The Case-Control Study , 2008 .

[22]  Michalis Faloutsos,et al.  On power-law relationships of the Internet topology , 1999, SIGCOMM '99.

[23]  W. Zachary An Information Flow Model for Conflict and Fission in Small Groups , 1977, Journal of Anthropological Research.

[24]  S. Wasserman,et al.  Advances in Social Network Analysis: Research in the Social and Behavioral Sciences , 1994 .

[25]  Paolo Toth,et al.  Algorithm 548: Solution of the Assignment Problem [H] , 1980, TOMS.

[26]  R. Sibson Studies in the Robustness of Multidimensional Scaling: Perturbational Analysis of Classical Scaling , 1979 .

[27]  A. Nobile Bayesian finite mixtures: a note on prior specification and posterior computation , 2007, 0711.0458.

[28]  Pavel N. Krivitsky,et al.  Latent Position and Cluster Models for Statistical Networks , 2015 .

[29]  Eric D. Kolaczyk,et al.  Statistical Analysis of Network Data , 2009 .

[30]  Nial Friel,et al.  Estimating the evidence – a review , 2011, 1111.1957.

[31]  A. Raftery,et al.  Model‐based clustering for social networks , 2007 .

[32]  Lada A. Adamic,et al.  Search in Power-Law Networks , 2001, Physical Review E.

[33]  Walter R. Gilks,et al.  Bayesian model comparison via jump diffusions , 1995 .

[34]  Susan M. Shortreed,et al.  Positional Estimation Within a Latent Space Model for Networks , 2006 .

[35]  Adrian E. Raftery,et al.  Enhanced Model-Based Clustering, Density Estimation, and Discriminant Analysis Software: MCLUST , 2003, J. Classif..

[36]  T. Snijders,et al.  Estimation and Prediction for Stochastic Blockstructures , 2001 .