Integrated Learning and Planning Based on Truncating Temporal Differences
[1] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[2] Pawel Cichosz, et al. Truncating Temporal Differences: On the Efficient Implementation of TD(λ) for Reinforcement Learning, 1994, J. Artif. Intell. Res.
[3] Pawel Cichosz, et al. Truncating Temporal Differences: On the Efficient Implementation of TD(λ) for Reinforcement Learning, 1995.
[4] P. Cichosz, et al. Truncated Temporal Differences with Function Approximation: Successful Examples Using CMAC, 1996.
[5] Chris Watkins, et al. Learning from delayed rewards, 1989.
[6] J. Mulawka. Fast and Efficient Reinforcement Learning with Truncated Temporal Differences, 1995.
[7] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[8] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[9] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1996.
[10] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[11] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[12] Pawel Cichosz, et al. Fast and Efficient Reinforcement Learning with Truncated Temporal Differences, 1995, ICML.
[13] Sridhar Mahadevan, et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning, 1991, Artif. Intell.
[14] J. Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, IEEE International Conference on Neural Networks.