Reinforcement Learning With High-Dimensional, Continuous Actions
[1] Paul J. Werbos, et al. Neural networks for control and system identification, 1989, Proceedings of the 28th IEEE Conference on Decision and Control.
[2] Vijaykumar Gullapalli, et al. A stochastic reinforcement learning algorithm for learning real-valued functions, 1990, Neural Networks.
[3] Gerald Tesauro, et al. Neurogammon: a neural-network backgammon program, 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[4] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[5] Peter J. Millington, et al. Associative reinforcement learning for optimal control, 1991.
[6] V. Gullapalli, et al. Associative reinforcement learning of real-valued functions, 1991, Conference Proceedings 1991 IEEE International Conference on Systems, Man, and Cybernetics.
[7] L. C. Baird. Function minimization for dynamic programming using connectionist networks, 1992, [Proceedings] 1992 IEEE International Conference on Systems, Man, and Cybernetics.
[8] James S. Morgan, et al. A Hierarchical Network of Control Systems that Learn: Modeling Nervous System Function During Classical and Instrumental Conditioning, 1993, Adapt. Behav.
[9] A. Harry Klopf, et al. A Hierarchical Network of Provably Optimal Learning Control Systems: Extensions of the Associative Control Process (ACP) Network, 1993, Adapt. Behav.
[10] A. Harry Klopf, et al. Extensions of the associative control process (ACP) network: hierarchies and provable optimality, 1993.
[11] A. Harry Klopf, et al. Modeling nervous system function with a hierarchical network of control systems that learn, 1993.
[12] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.