Constructing Sampling Schemes via Coupling: Markov Semigroups and Optimal Transport

In this paper we develop a general framework for constructing and analysing coupled Markov chain Monte Carlo samplers, allowing for both (possibly degenerate) diffusion and piecewise deterministic Markov processes. For many performance criteria of interest, including the asymptotic variance, the task of finding efficient couplings can be phrased in terms of problems related to optimal transport theory. We investigate general structural properties, proving a singularity theorem that has both geometric and probabilistic interpretations. Moreover, we show that these problems can often be solved approximately, and we support our findings with numerical experiments. For the particular objective of estimating the variance of a Bayesian posterior, our analysis suggests using novel techniques in the spirit of antithetic variates. Addressing the convergence to equilibrium of coupled processes, we furthermore derive a modified Poincaré inequality.
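To illustrate the classical idea of antithetic variates that the abstract alludes to, the following minimal sketch compares two ways of averaging a pair of draws when estimating E[f(X)] for X ~ N(0, 1). It is not the coupled MCMC construction developed in the paper: the i.i.d. Gaussian setting, the test function f = exp, the helper name `pair_estimates`, and the sample size are all assumptions chosen only for illustration.

```python
import math
import random
import statistics


def pair_estimates(f, n_pairs, antithetic, seed=0):
    """Return n_pairs estimates of E[f(X)], X ~ N(0, 1), each built from two draws.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_pairs):
        x = rng.gauss(0.0, 1.0)
        # Antithetic coupling: reuse x with flipped sign, so both draws are
        # marginally N(0, 1) but negatively correlated. Otherwise draw a
        # second, independent sample.
        y = -x if antithetic else rng.gauss(0.0, 1.0)
        estimates.append(0.5 * (f(x) + f(y)))
    return estimates


f = math.exp  # monotone test function (an illustrative choice)
independent = pair_estimates(f, 20_000, antithetic=False)
coupled = pair_estimates(f, 20_000, antithetic=True)

var_independent = statistics.variance(independent)
var_coupled = statistics.variance(coupled)
print(f"independent pairs: mean={statistics.fmean(independent):.3f}, var={var_independent:.3f}")
print(f"antithetic pairs:  mean={statistics.fmean(coupled):.3f}, var={var_coupled:.3f}")
```

Both estimators target the same mean, e^{1/2}, because each draw has the correct marginal distribution; only the joint law of the pair differs. For this particular f, the per-pair variance is (e^2 - e)/2 for independent draws and (e^2 + 1)/2 - e for the antithetic pair, so the coupled version should show the smaller empirical variance.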
