Rate-Optimal Bayesian Simple Regret in Best Arm Identification
[1] Junpei Komiyama, et al. Minimax Optimal Algorithms for Fixed-Budget Best Arm Identification, 2022, Neural Information Processing Systems.
[2] Daniel Russo, et al. Adaptive Experimentation in the Presence of Exogenous Nonstationary Variation, 2022, arXiv:2202.09036.
[3] Sattar Vakili, et al. Optimal Order Simple Regret for Gaussian Process Bandits, 2021, NeurIPS.
[4] L. J. Hong, et al. Review on ranking and selection: A new perspective, 2020, Frontiers of Engineering Management.
[5] Michal Valko, et al. Fixed-Confidence Guarantees for Bayesian Best-Arm Identification, 2019, AISTATS.
[6] Daniel Russo, et al. A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents, 2019, Oper. Res.
[7] Recent Advances in Optimization and Modeling of Contemporary Problems, 2018.
[8] P. Frazier. Bayesian Optimization, 2018, Hyperparameter Optimization in Machine Learning.
[9] Diego Klabjan, et al. Improving the Expected Improvement Algorithm, 2017, NIPS.
[10] Vahid Tarokh, et al. On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits, 2016, IEEE Transactions on Signal Processing.
[11] Alexandra Carpentier, et al. Tight (Lower) Bounds for the Fixed Budget Best Arm Identification Bandit Problem, 2016, COLT.
[12] Ilya O. Ryzhov, et al. On the Convergence Rates of Expected Improvement Methods, 2016, Oper. Res.
[13] Daniel Russo, et al. Simple Bayesian Algorithms for Best Arm Identification, 2016, COLT.
[14] Chun-Hung Chen, et al. Dynamic Sampling Allocation and Design Selection, 2016, INFORMS J. Comput.
[15] Emilie Kaufmann, et al. Analysis of Bayesian and frequentist strategies for sequential resource allocation, 2014.
[16] Aurélien Garivier, et al. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models, 2014, J. Mach. Learn. Res.
[17] Benjamin Van Roy, et al. Learning to Optimize via Information-Directed Sampling, 2014, NIPS.
[18] Rémi Munos, et al. Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis, 2012, ALT.
[19] Shipra Agrawal, et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem, 2011, COLT.
[20] Rémi Munos, et al. Pure exploration in finitely-armed and continuous-armed bandits, 2011, Theor. Comput. Sci.
[21] Adam D. Bull, et al. Convergence Rates of Efficient Global Optimization Algorithms, 2011, J. Mach. Learn. Res.
[22] R. Munos, et al. Best Arm Identification in Multi-Armed Bandits, 2010, COLT.
[23] Ole-Christoffer Granmo, et al. A Bayesian Learning Automaton for Solving Two-Armed Bernoulli Bandit Problems, 2008, Seventh International Conference on Machine Learning and Applications.
[24] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, J. Mach. Learn. Res.
[25] Peter W. Glynn, et al. A large deviations perspective on ordinal optimization, 2004, Proceedings of the 2004 Winter Simulation Conference.
[26] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[27] Chun-Hung Chen, et al. Simulation Budget Allocation for Further Enhancing the Efficiency of Ordinal Optimization, 2000, Discret. Event Dyn. Syst.
[28] Andrew W. Moore, et al. The Racing Algorithm: Model Selection for Lazy Learners, 1997, Artificial Intelligence Review.
[29] R. Weber. On the Gittins Index for Multiarmed Bandits, 1992.
[30] Yu-Chi Ho, et al. Ordinal optimization of DEDS, 1992, Discret. Event Dyn. Syst.
[31] T. Lai. Adaptive treatment allocation and the multi-armed bandit problem, 1987.
[32] H. Robbins, et al. Asymptotically efficient adaptive allocation rules, 1985.
[33] Walter T. Federer, et al. Sequential Design of Experiments, 1967.
[34] E. Paulson. A Sequential Procedure for Selecting the Population with the Largest Mean from $k$ Normal Populations, 1964.
[35] H. Robbins. Some aspects of the sequential design of experiments, 1952.
[36] W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, 1933.
[37] E. Kaufmann, et al. Fixed-Confidence Guarantees for Bayesian Best-Arm Identification, 2020.
[38] Akimichi Takemura, et al. Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards, 2015, J. Mach. Learn. Res.
[39] Christian M. Ernst, et al. Multi-armed Bandit Allocation Indices, 1989.
[40] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules, 1985.