On the Performance of Thompson Sampling on Logistic Bandits
[1] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[2] P. Erdős. On an extremal problem in graph theory, 1970.
[3] Robert E. Tarjan, et al. Finding a Maximum Independent Set, 1976, SIAM J. Comput.
[4] K. Böröczky. Finite Packing and Covering, 2004.
[5] Etsuji Tomita, et al. An Efficient Branch-and-bound Algorithm for Finding a Maximum Clique with Computational Experiments, 2001, J. Glob. Optim.
[6] Thomas P. Hayes, et al. Stochastic Linear Optimization under Bandit Feedback, 2008, COLT.
[7] Aurélien Garivier, et al. Parametric Bandits: The Generalized Linear Case, 2010, NIPS.
[8] Benjamin Van Roy, et al. Eluder Dimension and the Sample Complexity of Optimistic Exploration, 2013, NIPS.
[9] Benjamin Van Roy, et al. Learning to Optimize via Posterior Sampling, 2013, Math. Oper. Res.
[10] Benjamin Van Roy, et al. Learning to Optimize via Information-Directed Sampling, 2014, NIPS.
[11] Benjamin Van Roy, et al. An Information-Theoretic Analysis of Thompson Sampling, 2014, J. Mach. Learn. Res.
[12] Sébastien Bubeck, et al. Multi-scale exploration of convex functions and bandit convex optimization, 2015, COLT.
[13] Alessandro Lazaric, et al. Linear Thompson Sampling Revisited, 2016, AISTATS.
[14] Lihong Li, et al. Provable Optimal Algorithms for Generalized Linear Contextual Bandits, 2017, arXiv.
[15] Fang Liu, et al. Information Directed Sampling for Stochastic Bandits with Graph Feedback, 2017, AAAI.
[16] Shi Dong, et al. An Information-Theoretic Analysis for Thompson Sampling with Many Actions, 2018, NeurIPS.
[17] Benjamin Van Roy, et al. Satisficing in Time-Sensitive Bandit Learning, 2018, Math. Oper. Res.