High-Order Langevin Diffusion Yields an Accelerated MCMC Algorithm

We propose a Markov chain Monte Carlo (MCMC) algorithm based on third-order Langevin dynamics for sampling from distributions with log-concave and smooth densities. The higher-order dynamics allow for more flexible discretization schemes, and we develop a specific method that combines splitting with more accurate integration. For a broad class of $d$-dimensional distributions arising from generalized linear models, we prove that the resulting third-order algorithm produces samples from a distribution that is at most $\varepsilon > 0$ in Wasserstein distance from the target distribution in $O\left(\frac{d^{1/3}}{ \varepsilon^{2/3}} \right)$ steps. This result requires only Lipschitz conditions on the gradient. For general strongly convex potentials with $\alpha$-th order smoothness, we prove that the mixing time scales as $O \left(\frac{d^{1/3}}{\varepsilon^{2/3}} + \frac{d^{1/2}}{\varepsilon^{1/(\alpha - 1)}} \right)$.
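To get a feel for the two step-count bounds quoted above, the following minimal sketch evaluates their leading-order terms numerically. It is an illustration, not part of the paper: the function name `mixing_time_scaling` is hypothetical, all hidden constants are set to 1, and any dependence on the condition number or other problem parameters is ignored.

```python
def mixing_time_scaling(d, eps, alpha=None):
    """Leading-order step count from the abstract, with all constants set to 1.

    d     : dimension of the target distribution
    eps   : target accuracy in Wasserstein distance
    alpha : order of smoothness for the general strongly convex case;
            None selects the generalized-linear-model bound d^{1/3} / eps^{2/3}.
    """
    if d <= 0 or eps <= 0:
        raise ValueError("d and eps must be positive")

    # First term: d^{1/3} / eps^{2/3}, shared by both bounds.
    first = d ** (1.0 / 3.0) / eps ** (2.0 / 3.0)
    if alpha is None:
        return first

    if alpha <= 1:
        raise ValueError("alpha must exceed 1 for the exponent 1/(alpha - 1)")

    # Second term: d^{1/2} / eps^{1/(alpha - 1)}, present only in the general case.
    second = d ** 0.5 / eps ** (1.0 / (alpha - 1.0))
    return first + second


if __name__ == "__main__":
    # Compare the two bounds for a few (d, eps) pairs, assuming alpha = 3.
    for d, eps in [(100, 0.1), (10_000, 0.01)]:
        glm = mixing_time_scaling(d, eps)
        general = mixing_time_scaling(d, eps, alpha=3)
        print(f"d={d}, eps={eps}: GLM ~ {glm:.1f}, general (alpha=3) ~ {general:.1f}")
```

Because the constants are dropped, only the relative growth in $d$ and $\varepsilon$ is meaningful here, not the absolute numbers.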

Martin J. Wainwright | Michael I. Jordan | Peter L. Bartlett | Yi-An Ma | Wenlong Mou