X-Armed Bandits
[1] Rémi Munos, et al. Pure Exploration in Multi-armed Bandits Problems, 2009, ALT.
[2] H. Jaap van den Herik, et al. Progressive Strategies for Monte-Carlo Tree Search, 2008.
[3] Yngvi Björnsson, et al. Simulation-Based Approach to General Game Playing, 2008, AAAI.
[4] Sylvain Gelly, et al. Achieving Master Level Play in 9×9 Computer Go, 2008, AAAI.
[5] Robert D. Kleinberg, et al. Multi-armed Bandits in Metric Spaces, 2008, STOC.
[6] David Silver, et al. Combining Online and Offline Knowledge in UCT, 2007, ICML.
[7] Peter Auer, et al. Improved Rates for the Stochastic Continuum-Armed Bandit Problem, 2007, COLT.
[8] Pierre-Arnaud Coquelin, et al. Bandit Algorithms for Tree Search, 2007, UAI.
[9] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[10] G. Lugosi, et al. Prediction, Learning, and Games, 2006.
[11] Robert D. Kleinberg. Nearly Tight Bounds for the Continuum-Armed Bandit Problem, 2004, NIPS.
[12] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[13] R. Agrawal. Sample Mean Based Index Policies by O(log n) Regret for the Multi-armed Bandit Problem, 1995, Advances in Applied Probability.
[14] R. Agrawal. The Continuum-Armed Bandit Problem, 1995.
[15] H. Robbins. Some Aspects of the Sequential Design of Experiments, 1952.
[16] Elad Hazan, et al. Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization, 2008, COLT.
[17] Olivier Teytaud, et al. Modification of UCT with Patterns in Monte-Carlo Go, 2006.
[18] W. Hoeffding. Probability Inequalities for Sums of Bounded Random Variables, 1963, Journal of the American Statistical Association.
[19] T. L. Lai, Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985, Advances in Applied Mathematics.