Experience-efficient learning in associative bandit problems
Alexander L. Strehl | Chris Mesterharm | Michael L. Littman | Haym Hirsh
[1] P. W. Jones, et al. Bandit Problems: Sequential Allocation of Experiments, 1987.
[2] Claude-Nicolas Fiechter. Expected Mistake Bound Model for On-Line Reinforcement Learning, 1997, ICML.
[3] Leslie G. Valiant, et al. A Theory of the Learnable, 1984, CACM.
[4] John Langford, et al. Estimating Class Membership Probabilities Using Classifier Learners, 2005, AISTATS.
[5] John Langford, et al. Cost-Sensitive Learning by Cost-Proportionate Example Weighting, 2003, Third IEEE International Conference on Data Mining.
[6] Peter Auer, et al. An Improved On-line Algorithm for Learning Linear Evaluation Functions, 2000, COLT.
[7] C. Fiechter. PAC Associative Reinforcement Learning, 1995.
[8] Robert E. Schapire, et al. Efficient Distribution-Free Learning of Probabilistic Concepts, 1990, Proceedings of the 31st Annual Symposium on Foundations of Computer Science.
[9] Leslie Pack Kaelbling, et al. Associative Reinforcement Learning: Functions in k-DNF, 1994, Machine Learning.
[10] Philip W. L. Fong. A Quantitative Study of Hypothesis Selection, 1995, ICML.
[11] Thomas G. Dietterich. What Is Machine Learning?, 2020, Archives of Disease in Childhood.
[12] Philip M. Long, et al. Reinforcement Learning with Immediate Rewards and Linear Hypotheses, 2003, Algorithmica.
[13] Charles Elkan, et al. The Foundations of Cost-Sensitive Learning, 2001, IJCAI.