John Langford | Lihong Li | Robert E. Schapire | Alina Beygelzimer | Lev Reyzin
[1] Matthew J. Streeter, et al. Tighter Bounds for Multi-Armed Bandits with Expert Advice. COLT, 2009.
[2] Jacob D. Abernethy, et al. An Efficient Bandit Algorithm for √T-Regret in Online Multiclass Prediction? COLT, 2009.
[3] Deepak Agarwal, et al. Online Models for Content Optimization. NIPS, 2008.
[4] Ambuj Tewari, et al. Efficient Bandit Algorithms for Online Multiclass Prediction. ICML, 2008.
[5] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
[6] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, 2002.
[7] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem. SIAM Journal on Computing, 2002.
[8] Robert E. Schapire, et al. Predicting Nearly as Well as the Best Pruning of a Decision Tree. COLT, 1995.
[9] Manfred K. Warmuth, et al. The Weighted Majority Algorithm. FOCS, 1989.
[10] David A. Freedman. On Tail Probabilities for Martingales. The Annals of Probability, 1975.
[11] Herbert Robbins. Some Aspects of the Sequential Design of Experiments. Bulletin of the American Mathematical Society, 1952.