Efficient learning by implicit exploration in bandit problems with side observations
Rémi Munos | Michal Valko | Gergely Neu | Tomás Kocák
[1] L. Gyorfi, et al. Sequential Prediction of Unbounded Stationary Time Series, 2007, IEEE Transactions on Information Theory.
[2] Nicolò Cesa-Bianchi, et al. Combinatorial Bandits, 2012, COLT.
[3] Gergely Neu, et al. An Efficient Algorithm for Learning with Semi-bandit Feedback, 2013, ALT.
[4] David Haussler, et al. How to Use Expert Advice, 1993, STOC.
[5] Wei Chen, et al. Combinatorial Multi-Armed Bandit: General Framework and Applications, 2013, ICML.
[6] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[7] Gábor Lugosi, et al. Regret in Online Combinatorial Optimization, 2012, Math. Oper. Res.
[8] Vladimir Vovk, et al. Aggregating Strategies, 1990, COLT '90.
[9] Shie Mannor, et al. From Bandits to Experts: On the Value of Side-Observations, 2011, NIPS.
[10] Gábor Lugosi, et al. Prediction, Learning, and Games, 2006.
[11] Santosh S. Vempala, et al. Efficient Algorithms for Online Decision Problems, 2005, J. Comput. Syst. Sci.
[12] Philip Wolfe, et al. Contributions to the Theory of Games, 1953.
[13] James Hannan, et al. Approximation to Bayes Risk in Repeated Play, 1958.
[14] Claudio Gentile, et al. Adaptive and Self-Confident On-Line Learning Algorithms, 2000, J. Comput. Syst. Sci.
[15] Noga Alon, et al. From Bandits to Experts: A Tale of Domination and Independence, 2013, NIPS.
[16] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.
[17] Marcus Hutter, et al. Prediction with Expert Advice by Following the Perturbed Leader for General Weights, 2004, ALT.