[1] Aleksandrs Slivkins, et al. Contextual Bandits with Similarity Information, 2009, COLT.
[2] F. Barahona. On the computational complexity of Ising spin glass models, 1982.
[3] J. Abernethy, et al. An Efficient Bandit Algorithm for √T-Regret in Online Multiclass Prediction?, 2009, COLT.
[4] John Langford, et al. Contextual Bandit Algorithms with Supervised Learning Guarantees, 2010, AISTATS.
[5] Martin Pál, et al. Contextual Multi-Armed Bandits, 2010, AISTATS.
[6] Claudio Gentile, et al. Robust bounds for classification via selective sampling, 2009, ICML '09.
[7] Michael L. Littman, et al. Online Linear Regression and Its Application to Model-Based Reinforcement Learning, 2007, NIPS.
[8] Philip M. Long, et al. Reinforcement Learning with Immediate Rewards and Linear Hypotheses, 2003, Algorithmica.
[9] J. Langford, et al. The Epoch-Greedy algorithm for contextual multi-armed bandits, 2007, NIPS.
[10] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[11] Jacob D. Abernethy, et al. An Efficient Bandit Algorithm for √T-Regret in Online Multiclass Prediction?, 2009, COLT.
[12] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[13] Wei Chu, et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW '10.
[14] Thomas J. Walsh, et al. Knows what it knows: a framework for self-aware learning, 2008, ICML '08.