暂无分享,去创建一个
[1] Vahid Tarokh,et al. On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits , 2016, IEEE Transactions on Signal Processing.
[2] Peter Auer,et al. Improved Rates for the Stochastic Continuum-Armed Bandit Problem , 2007, COLT.
[3] Suleyman S. Kozat,et al. Minimax Optimal Algorithms for Adversarial Bandit Problem With Multiple Plays , 2019, IEEE Transactions on Signal Processing.
[4] Sivaraman Balakrishnan,et al. Optimization of Smooth Functions With Noisy Observations: Local Minimax Rates , 2018, IEEE Transactions on Information Theory.
[5] Yingcun Xia,et al. Bias‐corrected confidence bands in nonparametric regression , 1998 .
[6] Robert D. Kleinberg. Nearly Tight Bounds for the Continuum-Armed Bandit Problem , 2004, NIPS.
[7] Qing Zhao,et al. Distributed Learning in Multi-Armed Bandit With Multiple Players , 2009, IEEE Transactions on Signal Processing.
[8] Adam Tauman Kalai,et al. Online convex optimization in the bandit setting: gradient descent without a gradient , 2004, SODA '05.
[9] Xin Fu,et al. Confidence bands in nonparametric regression , 2009 .
[10] Ohad Shamir,et al. On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization , 2012, COLT.
[11] Jörg Polzehl,et al. Simultaneous bootstrap confidence bands in nonparametric regression , 1998 .
[12] T. Lai. Adaptive treatment allocation and the multi-armed bandit problem , 1987 .
[13] R. Munos,et al. Kullback–Leibler upper confidence bounds for optimal sequential allocation , 2012, 1210.1136.
[14] Rémi Munos,et al. Algorithms for Infinitely Many-Armed Bandits , 2008, NIPS.
[15] Aleksandrs Slivkins,et al. Introduction to Multi-Armed Bandits , 2019, Found. Trends Mach. Learn..
[16] Cong Shen. Universal Best Arm Identification , 2019, IEEE Transactions on Signal Processing.
[17] T. L. Lai Andherbertrobbins. Asymptotically Efficient Adaptive Allocation Rules , 2022 .
[18] Adam D. Bull,et al. Adaptive-treed bandits , 2013, 1302.2489.
[19] Zongwu Cai,et al. Weighted Nadaraya–Watson regression estimation , 2001 .
[20] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[21] Rémi Munos,et al. Optimistic Optimization of Deterministic Functions , 2011, NIPS 2011.
[22] Aurélien Garivier,et al. The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond , 2011, COLT.
[23] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[24] Jia Yuan Yu,et al. Lipschitz Bandits without the Lipschitz Constant , 2011, ALT.
[25] Yin Tat Lee,et al. Kernel-based methods for bandit convex optimization , 2016, STOC.
[26] Eli Upfal,et al. Multi-Armed Bandits in Metric Spaces ∗ , 2008 .
[27] Vianney Perchet,et al. Highly-Smooth Zero-th Order Online Optimization , 2016, COLT.
[28] Stanislav Minsker,et al. Estimation of Extreme Values and Associated Level Sets of a Regression Function via Selective Sampling , 2013, COLT.
[29] Thomas P. Hayes,et al. Stochastic Linear Optimization under Bandit Feedback , 2008, COLT.
[30] Sham M. Kakade,et al. Stochastic Convex Optimization with Bandit Feedback , 2011, SIAM J. Optim..
[31] Robert D. Nowak,et al. Query Complexity of Derivative-Free Optimization , 2012, NIPS.
[32] Elad Hazan,et al. Bandit Convex Optimization: Towards Tight Bounds , 2014, NIPS.
[33] Alexandra Carpentier,et al. Adaptivity to Smoothness in X-armed bandits , 2018, COLT.
[34] Cong Shen,et al. Cost-Aware Cascading Bandits , 2018, IEEE Transactions on Signal Processing.