On Approximate Thompson Sampling with Langevin Algorithms
Michael I. Jordan | Peter L. Bartlett | Yi-An Ma | Aldo Pacchiano | Eric V. Mazumdar
[1] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[2] S. Basu,et al. The Mean, Median, and Mode of Unimodal Distributions: A Characterization , 1997 .
[3] S. Shreve,et al. Stochastic differential equations , 1955, Mathematical Proceedings of the Cambridge Philosophical Society.
[4] M. Ledoux. Concentration of measure and logarithmic Sobolev inequalities , 1999 .
[5] A. V. D. Vaart,et al. Convergence rates of posterior distributions , 2000 .
[6] L. Wasserman,et al. Rates of convergence of posterior distributions , 2001 .
[7] M. Ledoux. The concentration of measure phenomenon , 2001 .
[8] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[9] Yaofeng Ren. On the Burkholder-Davis-Gundy inequalities for continuous martingales , 2008 .
[10] C. Villani. Optimal Transport: Old and New , 2008 .
[11] A. V. D. Vaart,et al. Rates of contraction of posterior distributions based on Gaussian process priors , 2008 .
[12] Steven L. Scott,et al. A modern Bayesian look at the multi-armed bandit , 2010 .
[13] Lihong Li,et al. An Empirical Evaluation of Thompson Sampling , 2011, NIPS.
[14] Yee Whye Teh,et al. Bayesian Learning via Stochastic Gradient Langevin Dynamics , 2011, ICML.
[15] Rémi Munos,et al. Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis , 2012, ALT.
[16] Shipra Agrawal,et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.
[17] Shipra Agrawal,et al. Further Optimal Regret Bounds for Thompson Sampling , 2012, AISTATS.
[18] Shipra Agrawal,et al. Thompson Sampling for Contextual Bandits with Linear Payoffs , 2012, ICML.
[19] Rémi Munos,et al. Thompson Sampling for 1-Dimensional Exponential Family Bandits , 2013, NIPS.
[20] J. Wellner,et al. Log-Concavity and Strong Log-Concavity: a review , 2014, Statistics surveys.
[21] Shie Mannor,et al. Thompson Sampling for Complex Online Problems , 2013, ICML.
[22] Tianqi Chen,et al. A Complete Recipe for Stochastic Gradient MCMC , 2015, NIPS.
[23] É. Moulines,et al. Non-asymptotic convergence analysis for the Unadjusted Langevin Algorithm , 2015, 1507.05021.
[24] É. Moulines,et al. Sampling from a strongly log-concave distribution with the Unadjusted Langevin Algorithm , 2016 .
[25] C. Gomez-Uribe. Online Algorithms For Parameter Mean And Variance Estimation In Dynamic Regression Models , 2016, 1605.05697.
[26] Benjamin Van Roy,et al. An Information-Theoretic Analysis of Thompson Sampling , 2014, J. Mach. Learn. Res..
[27] Benjamin Van Roy,et al. Ensemble Sampling , 2017, NIPS.
[28] Alessandro Lazaric,et al. Linear Thompson Sampling Revisited , 2016, AISTATS.
[29] Iñigo Urteaga,et al. Variational inference for the multi-armed contextual bandit , 2017, AISTATS.
[30] Peter L. Bartlett,et al. Convergence of Langevin MCMC in KL-divergence , 2017, ALT.
[31] Jasper Snoek,et al. Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling , 2018, ICLR.
[32] A. V. D. Vaart,et al. Convergence Rates of Posterior Distributions for Noniid Observations , 2018 .
[33] Arnak S. Dalalyan,et al. User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient , 2017, Stochastic Processes and their Applications.
[34] Michael I. Jordan,et al. Sampling can be faster than optimization , 2018, Proceedings of the National Academy of Sciences.
[35] Yasin Abbasi-Yadkori,et al. Thompson Sampling and Approximate Inference , 2019, NeurIPS.
[36] Michael I. Jordan,et al. A Diffusion Process Perspective on Posterior Contraction Rates for Parameters , 2019, 1909.00966.
[37] Santosh S. Vempala,et al. Rapid Convergence of the Unadjusted Langevin Algorithm: Isoperimetry Suffices , 2019, NeurIPS.
[38] Michael I. Jordan,et al. A Short Note on Concentration Inequalities for Random Vectors with SubGaussian Norm , 2019, ArXiv.
[39] Csaba Szepesvari,et al. Bandit Algorithms , 2020 .
[40] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .