Approximating Action-Value Functions: Addressing Issues of Dynamic Range
[1] Leemon C. Baird, et al. Residual advantage learning applied to a differential game, 1996, Proceedings of International Conference on Neural Networks (ICNN'96).
[2] Mark Harmon. Multi-player residual advantage learning with general function, 1996.
[3] Marios M. Polycarpou, et al. An analytical framework for local feedforward networks, 1998, IEEE Trans. Neural Networks.
[4] A. Harry Klopf, et al. Reinforcement Learning: An Alternative Approach to Machine Intelligence, 1996.
[5] Marios M. Polycarpou, et al. An analytical framework for local feedforward networks, 1996, Proceedings of the 1996 IEEE International Symposium on Intelligent Control.
[6] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[7] A. Harry Klopf, et al. Advantage Updating Applied to a Differential Game, 1994, NIPS.
[8] A. Harry Klopf, et al. Reinforcement Learning Applied to a Differential Game, 1995, Adapt. Behav.
[9] James S. Albus, et al. Brains, behavior, and robotics, 1981.
[10] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.