A simple multi-armed bandit algorithm with optimal variation-bounded regret
[1] Yishay Mansour, et al. Improved second-order bounds for prediction with expert advice, 2006, Machine Learning.
[2] Elad Hazan, et al. Better Algorithms for Benign Bandits, 2009, J. Mach. Learn. Res.
[3] Elad Hazan, et al. On Stochastic and Worst-case Models for Investing, 2009, NIPS.
[4] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[5] Elad Hazan, et al. Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization, 2008, COLT.
[6] Elad Hazan, et al. Extracting certainty from uncertainty: regret bounded by variation in costs, 2008, Machine Learning.