On Explore-Then-Commit strategies
Tor Lattimore | Aurélien Garivier | Emilie Kaufmann
[1] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[2] J. Andel. Sequential Analysis, 2022, The SAGE Encyclopedia of Research Design.
[3] R. Khan, et al. Sequential Tests of Statistical Hypotheses, 1972.
[4] H. Robbins, et al. Sequential Design of Comparative Clinical Trials, 1983.
[5] T. Lai. Adaptive treatment allocation and the multi-armed bandit problem, 1987.
[6] H. Robbins, et al. Sequential choice from several populations, 1995, Proceedings of the National Academy of Sciences of the United States of America.
[7] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[8] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, J. Mach. Learn. Res.
[9] A. Hoorfar, et al. Inequalities on the Lambert W Function and Hyperpower Function, 2008.
[10] Jean-Yves Audibert, et al. Minimax Policies for Adversarial and Stochastic Bandits, 2009, COLT 2009.
[11] Peter Auer, et al. UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem, 2010, Period. Math. Hung.
[12] R. Munos, et al. Kullback–Leibler upper confidence bounds for optimal sequential allocation, 2012, 1210.1136.
[13] Vianney Perchet, et al. Bounded regret in stochastic multi-armed bandits, 2013, COLT.
[14] Aurélien Garivier, et al. On the Complexity of A/B Testing, 2014, COLT.
[15] Sébastien Bubeck, et al. Prior-free and prior-dependent regret bounds for Thompson Sampling, 2013, 2014 48th Annual Conference on Information Sciences and Systems (CISS).
[16] Vianney Perchet, et al. Batched Bandit Problems, 2015, COLT.
[17] Tor Lattimore, et al. Optimally Confident UCB: Improved Regret for Finite-Armed Bandits, 2015, ArXiv.
[18] Aurélien Garivier, et al. Optimal Best Arm Identification with Fixed Confidence, 2016, COLT.
[19] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.