Reinforcement Learning with Immediate Rewards and Linear Hypotheses
[1] P. Anandan, et al. Pattern-recognizing stochastic learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.
[2] Leslie G. Valiant, et al. On the learnability of Boolean formulae, 1987, STOC.
[3] N. Littlestone. Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm, 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).
[4] Kumpati S. Narendra, et al. Learning automata - an introduction, 1989.
[5] David Haussler, et al. How to use expert advice, 1993, STOC.
[6] D. Haussler, et al. Rigorous learning curve bounds from statistical mechanics, 1994, COLT '94.
[7] Nicolò Cesa-Bianchi, et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem, 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[8] Philip M. Long, et al. Worst-case quadratic loss bounds for prediction using linear functions and gradient descent, 1996, IEEE Trans. Neural Networks.
[9] Philip M. Long. On-line evaluation and prediction using linear functions, 1997, COLT '97.
[10] Claude-Nicolas Fiechter, et al. Design and analysis of efficient reinforcement learning algorithms, 1997.
[11] Krishnan Rajagopalan, et al. Goal-Oriented Multimedia Dialogue with Variable Initiative, 1997, ISMIS.
[12] Naoki Abe, et al. Learning to Optimally Schedule Internet Banner Advertisements, 1999, ICML.
[13] Philip M. Long, et al. Associative Reinforcement Learning using Linear Probabilistic Concepts, 1999, ICML.
[14] Chamy Allenberg, et al. Individual sequence prediction—upper bounds and application for complexity, 1999, COLT '99.
[15] Peter Auer, et al. Using upper confidence bounds for online learning, 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[16] Philip M. Long, et al. Apple Tasting, 2000, Inf. Comput.
[17] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[18] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[19] Richard S. Sutton, et al. Associative search network: A reinforcement learning associative memory, 1981, Biological Cybernetics.
[20] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[21] R. Schapire, et al. Toward efficient agnostic learning, 1992, COLT '92.
[22] Leslie Pack Kaelbling, et al. Associative Reinforcement Learning: Functions in k-DNF, 1994, Machine Learning.
[23] L. Kaelbling. Associative reinforcement learning: A generate and test algorithm, 2004, Machine Learning.
[24] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.