Monte Carlo Approximation of Bayes Factors via Mixing with Surrogate Distributions

We describe a new method that estimates the normalizing constant of a posterior distribution by mixing the posterior with a surrogate distribution whose normalizing constant is tractable and applying the Wang-Landau algorithm. We then introduce an accelerated version of the method based on the momentum technique. We also discuss several extensions: (1) a parallel variant that inserts a sequence of intermediate distributions between the posterior and the surrogate to further improve efficiency; (2) use of the surrogate distribution to help detect potential multimodality of the posterior, after which a better sampler can be designed using mode-jumping algorithms; and (3) a new jumping mechanism for general reversible jump Markov chain Monte Carlo algorithms that combines Multiple-try Metropolis with directional sampling and can be used to estimate the normalizing constant when a surrogate distribution is difficult to come by. We illustrate the proposed methods on several statistical models, including the log-Gaussian Cox process, the Bayesian Lasso, logistic regression, the Gaussian mixture model, and g-prior Bayesian variable selection.
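
To make the mixing idea concrete, here is a minimal sketch under stated assumptions; it is not the paper's algorithm. It treats a toy unnormalized posterior and a normalized Gaussian surrogate as two components of an augmented target, and uses a basic Wang-Landau update with a decaying step size to learn log-weights that equalize the time spent in each component. The difference of the learned weights then approximates the log normalizing constant. The toy densities, the function names (`log_post_unnorm`, `log_surrogate`, `estimate_log_z`), and the step-size schedule `min(1, 10/t)` are all illustrative choices, not taken from the source.

```python
import math
import random


def log_post_unnorm(x):
    # Toy unnormalized "posterior" (assumption): 3 * exp(-x^2 / 2),
    # so the true normalizing constant is Z = 3 * sqrt(2 * pi).
    return math.log(3.0) - 0.5 * x * x


def log_surrogate(x, scale=1.5):
    # Normalized N(0, scale^2) density, i.e. normalizing constant 1.
    return -0.5 * (x / scale) ** 2 - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


def estimate_log_z(n_iter=100_000, step=1.0, seed=0):
    rng = random.Random(seed)
    log_q = (log_surrogate, log_post_unnorm)  # component 0 and component 1
    theta = [0.0, 0.0]  # Wang-Landau log-weights, one per component
    x, k = 0.0, 0       # current position and current component
    for t in range(1, n_iter + 1):
        # Random-walk Metropolis move in x within the current component.
        x_new = x + rng.gauss(0.0, step)
        if math.log(1.0 - rng.random()) < log_q[k](x_new) - log_q[k](x):
            x = x_new
        # Propose switching component at fixed x; the augmented target is
        # proportional to q_k(x) * exp(-theta_k).
        j = 1 - k
        log_ratio = (log_q[j](x) - theta[j]) - (log_q[k](x) - theta[k])
        if math.log(1.0 - rng.random()) < log_ratio:
            k = j
        # Wang-Landau update: penalize the component currently occupied.
        theta[k] += min(1.0, 10.0 / t)
    # Equal occupancy implies theta_k = log Z_k + const, and Z_0 = 1.
    return theta[1] - theta[0]


if __name__ == "__main__":
    true_log_z = math.log(3.0) + 0.5 * math.log(2.0 * math.pi)
    print("estimate:", estimate_log_z(), "truth:", true_log_z)
```

Because the surrogate is normalized, only the difference `theta[1] - theta[0]` matters. In this toy setting the two components overlap heavily, so switches between them are accepted often; a poorly matched surrogate would make switches rare and slow convergence.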
