Online Learning with Switching Costs and Other Adaptive Adversaries
Nicolò Cesa-Bianchi | Ofer Dekel | Ohad Shamir
[1] D. Teneketzis, et al. Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost, 1988.
[2] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.
[3] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.
[4] Allan Borodin, et al. Online computation and competitive analysis, 1998.
[5] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[6] Neri Merhav, et al. On sequential strategies for loss functions with memory, 2002, IEEE Trans. Inf. Theory.
[7] Santosh S. Vempala, et al. Efficient algorithms for online decision problems, 2005, J. Comput. Syst. Sci.
[8] Tackseung Jun. A survey on the bandit problem with switching costs, 2004, De Economist.
[9] Avrim Blum, et al. Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary, 2004, COLT.
[10] Chris Mesterharm, et al. On-line Learning with Delayed Label Feedback, 2005, ALT.
[11] Thomas P. Hayes, et al. Robbing the bandit: less regret in online geometric optimization against an adaptive adversary, 2006, SODA '06.
[12] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[13] Yishay Mansour, et al. Improved second-order bounds for prediction with expert advice, 2006, Machine Learning.
[14] Jacob D. Abernethy, et al. Beating the adaptive bandit with high probability, 2009, 2009 Information Theory and Applications Workshop.
[15] Ronald Ortner, et al. Online Regret Bounds for Markov Decision Processes with Deterministic Transitions, 2008, ALT.
[16] Csaba Szepesvári, et al. X-armed Bandits, 2022.
[17] Rémi Munos, et al. Adaptive Bandits: Towards the best history-dependent strategy, 2011, AISTATS.
[18] Ambuj Tewari, et al. Online Bandit Learning against an Adaptive Adversary: from Regret to Policy Regret, 2012, ICML.
[19] Ohad Shamir, et al. On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization, 2012, COLT.
[20] András György, et al. Near-Optimal Rates for Limited-Delay Universal Lossy Source Coding, 2014, IEEE Transactions on Information Theory.
[21] Claudio Gentile, et al. Regret Minimization for Reserve Prices in Second-Price Auctions, 2013, IEEE Transactions on Information Theory.