Bayesian Model Selection for Exponential Random Graph Models via Adjusted Pseudolikelihoods

ABSTRACT Models with intractable likelihood functions arise in areas including network analysis and spatial statistics, especially those involving Gibbs random fields. Posterior parameter estimation in these settings is termed a doubly intractable problem because both the likelihood function and the posterior distribution are intractable. Bayesian model comparison is often based on the statistical evidence, the integral of the un-normalized posterior distribution over the model parameters, which is rarely available in closed form. For doubly intractable models, estimating the evidence adds another layer of difficulty. Consequently, selecting the model that best describes an observed network from a collection of exponential random graph models is a daunting task. Pseudolikelihoods offer a tractable approximation to the likelihood but should be treated with caution because they can lead to unreasonable inference. This article specifies a method to adjust pseudolikelihoods to obtain a reasonable, yet tractable, approximation to the likelihood. This allows implementation of widely used computational methods for evidence estimation and pursuit of Bayesian model selection of exponential random graph models for the analysis of social networks. Empirical comparisons to existing methods show that our procedure yields similar evidence estimates, but at a lower computational cost. Supplementary material for this article is available online.
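
To make the notion of a pseudolikelihood concrete, the following is a minimal sketch of the standard (unadjusted) log-pseudolikelihood for an undirected exponential random graph model. It is not the adjustment proposed in the article. The choice of sufficient statistics (edge count and triangle count), the function names, and the toy network are all assumptions made purely for illustration. The pseudolikelihood replaces the intractable likelihood with a product of full conditional probabilities of each dyad, each of which is a logistic function of the change statistics.

```python
import math
from itertools import combinations


def change_statistics(adj, i, j):
    """Change in (edges, triangles) when dyad (i, j) goes from absent to present.

    Illustrative assumption: the model has exactly two statistics.
    Adding an edge raises the edge count by 1 and the triangle count
    by the number of neighbours shared by i and j.
    """
    n = len(adj)
    common = sum(1 for k in range(n) if k != i and k != j and adj[i][k] and adj[j][k])
    return (1.0, float(common))


def log_pseudolikelihood(adj, theta):
    """Sum over dyads of log p(y_ij | rest of the graph, theta)."""
    total = 0.0
    for i, j in combinations(range(len(adj)), 2):
        delta = change_statistics(adj, i, j)
        eta = sum(t * d for t, d in zip(theta, delta))  # conditional log-odds of a tie
        # Bernoulli log-probability with logit eta:
        # y_ij * eta - log(1 + exp(eta))
        total += adj[i][j] * eta - math.log1p(math.exp(eta))
    return total


# Toy network (hypothetical): a triangle on nodes 0, 1, 2 plus isolated node 3.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

print(log_pseudolikelihood(adj, (-1.0, 0.5)))
```

Because every term is an ordinary logistic-regression contribution, this quantity is cheap to evaluate, which is what makes pseudolikelihoods attractive. The product of conditionals ignores dependence between dyads, however, which is why the abstract warns that using it without adjustment can mislead inference.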
