High-Probability Regret Bounds for Bandit Online Linear Optimization
Thomas P. Hayes | Ambuj Tewari | Peter L. Bartlett | Sham M. Kakade | Varsha Dani | Alexander Rakhlin
[1] D. Freedman. On Tail Probabilities for Martingales, 1975.
[2] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[3] Manfred K. Warmuth, et al. Path Kernels and Multiplicative Updates, 2002, J. Mach. Learn. Res.
[4] Santosh S. Vempala, et al. Efficient algorithms for online decision problems, 2005, J. Comput. Syst. Sci.
[5] Baruch Awerbuch, et al. Adaptive routing with end-to-end feedback: distributed learning and geometric approaches, 2004, STOC '04.
[6] Avrim Blum, et al. Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary, 2004, COLT.
[7] Robert D. Kleinberg, et al. Online decision problems with large strategy sets, 2005.
[8] Baruch Awerbuch, et al. Provably competitive adaptive routing, 2005, Proceedings IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies.
[9] Thomas P. Hayes, et al. Robbing the bandit: less regret in online geometric optimization against an adaptive adversary, 2006, SODA '06.
[10] Gábor Lugosi, et al. Prediction, learning, and games, 2006.
[11] Tamás Linder, et al. The On-Line Shortest Path Problem Under Partial Monitoring, 2007, J. Mach. Learn. Res.
[12] Thomas P. Hayes, et al. The Price of Bandit Information for Online Optimization, 2007, NIPS.
[13] Thomas P. Hayes, et al. Stochastic Linear Optimization under Bandit Feedback, 2008, COLT.
[14] Claudio Gentile, et al. Improved Risk Tail Bounds for On-Line Algorithms, 2005, IEEE Transactions on Information Theory.
[15] Elad Hazan, et al. Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization, 2008, COLT.