Q-learning for Robots
[1] Claude Touzet, et al. Dynamic Update of the Reinforcement Function During Learning, 1999, Connect. Sci.
[2] Sridhar Mahadevan, et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning, 1991, Artif. Intell.
[3] Claude F. Touzet, et al. Neural reinforcement learning for behaviour synthesis, 1997, Robotics Auton. Syst.
[4] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[5] Andrew McCallum, et al. Instance-Based State Identification for Reinforcement Learning, 1994, NIPS.
[6] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[7] Chris Watkins, et al. Learning from delayed rewards, 1989.
[8] David W. Aha, et al. Lazy Learning, 1997, Springer Netherlands.
[9] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[10] Steven Salzberg, et al. A Teaching Strategy for Memory-Based Control, 1997, Artificial Intelligence Review.
[11] Claude F. Touzet. Programming robots with associative memories, 1999, IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339).
[12] David H. Ackley, et al. Interactions between learning and evolution, 1991.
[13] P. Dayan, et al. TD(λ) converges with probability 1, 2004, Machine Learning.
[14] Marco Colombetti, et al. Robot Shaping: An Experiment in Behavior Engineering, 1997.
[15] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.