Divide and Conquer: A Mixture-Based Approach to Regional Adaptation for MCMC

The efficiency of Markov chain Monte Carlo (MCMC) algorithms can vary dramatically with the choice of simulation parameters. Adaptive MCMC (AMCMC) algorithms tune these parameters automatically while the simulation is in progress. A multimodal target distribution may call for regional adaptation of Metropolis–Hastings samplers, so that the proposal distribution varies across regions of the sample space. Establishing such a partition is not straightforward and, in many instances, the learning required to specify it takes place gradually as the simulation proceeds. For the case in which the target distribution is approximated by a mixture of Gaussians, we propose an adaptation process for the partition. It involves fitting the mixture to the available samples via an online EM algorithm and, based on the current mixture parameters, constructing the regional adaptive algorithm with online recursion (RAPTOR). The method is compared with other regional AMCMC samplers and is tested on simulated as well as real data examples. Relevant theoretical proofs, code and datasets are posted as an online supplement.
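
To make the online EM idea concrete, the following is a minimal sketch of a stochastic-approximation EM update for a one-dimensional Gaussian mixture. It is not the RAPTOR algorithm from the paper or its supplement: the function names (`params_from_stats`, `responsibilities`, `online_em_step`, `region_of`, `run_demo`), the step-size schedule `(n + 1) ** (-decay)`, the variance floor `min_var`, and the rule that assigns a point to the component with the largest responsibility are all assumptions made for illustration. The demo also feeds in independent draws, whereas in an adaptive sampler the inputs would be the chain's own states.

```python
import math
import random


def params_from_stats(stats, min_var=1e-6):
    """Map running sufficient statistics to (weights, means, variances)."""
    s0, s1, s2 = stats
    total = sum(s0)
    w = [a / total for a in s0]
    mu = [b / a for a, b in zip(s0, s1)]
    # Floor the variance so a component cannot collapse (illustrative choice).
    var = [max(c / a - m * m, min_var) for a, c, m in zip(s0, s2, mu)]
    return w, mu, var


def responsibilities(x, params):
    """Posterior probability of each mixture component given the point x."""
    w, mu, var = params
    log_r = [
        math.log(wk) - 0.5 * math.log(2 * math.pi * vk) - 0.5 * (x - mk) ** 2 / vk
        for wk, mk, vk in zip(w, mu, var)
    ]
    top = max(log_r)  # subtract the maximum for numerical stability
    r = [math.exp(v - top) for v in log_r]
    total = sum(r)
    return [v / total for v in r]


def online_em_step(x, stats, n, decay=0.6):
    """Blend the statistics of the n-th sample into the running averages."""
    r = responsibilities(x, params_from_stats(stats))
    gamma = (n + 1.0) ** (-decay)  # assumed decreasing step size, below 1
    s0, s1, s2 = stats
    s0 = [a + gamma * (rk - a) for a, rk in zip(s0, r)]
    s1 = [b + gamma * (rk * x - b) for b, rk in zip(s1, r)]
    s2 = [c + gamma * (rk * x * x - c) for c, rk in zip(s2, r)]
    return s0, s1, s2


def region_of(x, params):
    """Hypothetical partition rule: index of the most responsible component."""
    r = responsibilities(x, params)
    return r.index(max(r))


def run_demo(seed=0, n_samples=5000):
    """Fit a two-component mixture to draws centred at -3 and +3."""
    rng = random.Random(seed)
    w0, mu0, var0 = [0.5, 0.5], [-1.0, 1.0], [4.0, 4.0]
    stats = (
        list(w0),
        [w * m for w, m in zip(w0, mu0)],
        [w * (v + m * m) for w, m, v in zip(w0, mu0, var0)],
    )
    for n in range(1, n_samples + 1):
        centre = -3.0 if rng.random() < 0.5 else 3.0
        stats = online_em_step(rng.gauss(centre, 1.0), stats, n)
    return params_from_stats(stats)


if __name__ == "__main__":
    weights, means, variances = run_demo()
    print("weights:", weights)
    print("means:", means)
    print("variances:", variances)
```

Each update touches only the running statistics, so the mixture can be refreshed after every new sample without revisiting earlier ones. A regional sampler could then use a rule such as `region_of` to choose which proposal to apply at the current state, again under the assumptions stated above.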
