Incremental Multi-Step
[1] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[2] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.
[3] Mark D. Pendrith. On Reinforcement Learning of Control Actions in Noisy and Non-Markovian Domains, 1994.
[4] J. Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, IEEE International Conference on Neural Networks.
[5] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[6] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.