Thompson Sampling for 1-Dimensional Exponential Family Bandits
[1] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[2] H. Jeffreys. An Invariant Form for the Prior Probability in Estimation Problems, 1946, Proceedings of the Royal Society of London, Series A, Mathematical and Physical Sciences.
[3] A. W. van der Vaart, et al. Asymptotic Statistics: U-Statistics, 1998.
[4] Gábor Lugosi, et al. Concentration Inequalities, 2008, COLT.
[5] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[6] Larry Wasserman, et al. All of Statistics: A Concise Course in Statistical Inference, 2004.
[7] Akimichi Takemura, et al. An Asymptotically Optimal Bandit Algorithm for Bounded Support Models, 2010, COLT.
[8] Aurélien Garivier, et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond, 2011, COLT.
[9] Rémi Munos, et al. Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis, 2012, ALT.
[10] David S. Leslie, et al. Optimistic Bayesian Sampling in Contextual-Bandit Problems, 2012, J. Mach. Learn. Res.
[11] Shipra Agrawal, et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem, 2011, COLT.
[12] R. Munos, et al. Kullback–Leibler Upper Confidence Bounds for Optimal Sequential Allocation, 2012, arXiv:1210.1136.
[13] Sébastien Bubeck, et al. A Note on the Bayesian Regret of Thompson Sampling with an Arbitrary Prior, 2013, ArXiv.
[14] Shipra Agrawal, et al. Further Optimal Regret Bounds for Thompson Sampling, 2012, AISTATS.
[15] Shipra Agrawal, et al. Thompson Sampling for Contextual Bandits with Linear Payoffs, 2012, ICML.
[16] Benjamin Van Roy, et al. Learning to Optimize via Posterior Sampling, 2013, Math. Oper. Res.
[17] Sébastien Bubeck, et al. Prior-free and Prior-dependent Regret Bounds for Thompson Sampling, 2013, 2014 48th Annual Conference on Information Sciences and Systems (CISS).
[18] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.