暂无分享,去创建一个
[1] Rémi Munos,et al. Spectral Thompson Sampling , 2014, AAAI.
[2] J. Meigs,et al. WHO Technical Report , 1954, The Yale Journal of Biology and Medicine.
[3] Aurélien Garivier,et al. Non-Asymptotic Sequential Tests for Overlapping Hypotheses and application to near optimal arm identification in bandit models , 2019 .
[4] Noga Alon,et al. From Bandits to Experts: A Tale of Domination and Independence , 2013, NIPS.
[5] Daniel Russo,et al. Simple Bayesian Algorithms for Best Arm Identification , 2016, COLT.
[6] Shie Mannor,et al. From Bandits to Experts: On the Value of Side-Observations , 2011, NIPS.
[7] Rémi Munos,et al. Efficient learning by implicit exploration in bandit problems with side observations , 2014, NIPS.
[8] Sébastien Bubeck,et al. Convex Optimization: Algorithms and Complexity , 2014, Found. Trends Mach. Learn..
[9] Wouter M. Koolen,et al. Pure Exploration with Multiple Correct Answers , 2019, NeurIPS.
[10] Alessandro Lazaric,et al. Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence , 2012, NIPS.
[11] Noga Alon,et al. Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback , 2014, SIAM J. Comput..
[12] Michal Valko,et al. Online Learning with Noisy Side Observations , 2016, AISTATS.
[13] Aurélien Garivier,et al. Optimal Best Arm Identification with Fixed Confidence , 2016, COLT.
[14] R. Munos,et al. Spectral bandits , 2020 .
[15] Rémi Munos,et al. Spectral Bandits for Smooth Graph Functions , 2014, ICML.
[16] Wouter M. Koolen,et al. Non-Asymptotic Pure Exploration by Solving Games , 2019, NeurIPS.
[17] Walter T. Federer,et al. Sequential Design of Experiments , 1967 .
[18] Michal Valko,et al. Online learning with Erdos-Renyi side-observation graphs , 2016, UAI.
[19] Shie Mannor,et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems , 2006, J. Mach. Learn. Res..