
Adaptivity to Smoothness in X-armed bandits

We study the stochastic continuum-armed bandit problem from the angle of adaptivity to the unknown regularity of the reward function f. We prove that, for cumulative regret, no strategy adapts optimally to the smoothness of f. We show, however, that such minimax optimal adaptive strategies exist if the learner is given extra information about f. Finally, we complement our positive results with matching lower bounds.

Alexandra Carpentier | Andrea Locatelli
