Estimating the Integrated Likelihood via Posterior Simulation Using the Harmonic Mean Identity

The integrated likelihood (also called the marginal likelihood or the normalizing constant) is a central quantity in Bayesian model selection and model averaging. It is defined as the integral over the parameter space of the likelihood times the prior density. The Bayes factor for model comparison and Bayesian testing is a ratio of integrated likelihoods, and the model weights in Bayesian model averaging are proportional to the integrated likelihoods. We consider the estimation of the integrated likelihood from posterior simulation output, aiming at a generic method that uses only the likelihoods from the posterior simulation iterations. The key is the harmonic mean identity, which says that the reciprocal of the integrated likelihood is equal to the posterior mean of the reciprocal of the likelihood. The simplest estimator based on the identity is thus the harmonic mean of the likelihoods. While this estimator is simulation-consistent, and its reciprocal is an unbiased estimator of the reciprocal of the integrated likelihood, that reciprocal can have infinite variance, so the estimator is unstable in general.
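
As a minimal sketch of the simplest estimator described above, and not the authors' implementation: given the log-likelihoods evaluated at B posterior draws, the harmonic mean estimate is B divided by the sum of the reciprocal likelihoods. The function name `log_harmonic_mean_estimate` and the choice to work on the log scale with a log-sum-exp step are assumptions made here for numerical stability; they do not come from the text.

```python
import math


def log_harmonic_mean_estimate(log_liks):
    """Return the log of the harmonic mean of the likelihoods.

    log_liks holds log-likelihood values evaluated at posterior draws
    (hypothetical input; any posterior sampler could supply them).
    """
    if not log_liks:
        raise ValueError("need at least one log-likelihood value")
    # Reciprocal likelihoods on the log scale: log(1 / L_t) = -log L_t.
    neg = [-ll for ll in log_liks]
    top = max(neg)
    # Log-sum-exp of the reciprocals, shifted by the maximum to avoid overflow.
    log_sum_inv = top + math.log(sum(math.exp(v - top) for v in neg))
    # Harmonic mean = B / sum_t (1 / L_t), taken on the log scale.
    return math.log(len(log_liks)) - log_sum_inv


# Toy usage: two draws with likelihoods 1 and 2 (illustrative numbers only).
example = log_harmonic_mean_estimate([math.log(1.0), math.log(2.0)])
print(math.exp(example))
```

The sum of reciprocals is dominated by the draws with the smallest likelihoods, so a single rare low-likelihood draw can move the estimate substantially. This is one way to see the instability noted above.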
