Distributed Value Functions
Andrew W. Moore | Weng-Keen Wong | Jeff G. Schneider | Martin A. Riedmiller
[1] T. Michael Knasel, et al. Robotics and autonomous systems, 1988, Robotics Auton. Syst.
[2] C. Watkins. Learning from delayed rewards, 1989.
[3] Michael L. Littman, et al. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach, 1993, NIPS.
[4] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[5] Minoru Asada, et al. Coordination of multiple behaviors acquired by a vision-based reinforcement learning, 1994, Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '94).
[6] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.
[7] Andrew G. Barto, et al. Improving Elevator Performance Using Reinforcement Learning, 1995, NIPS.
[8] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[9] Gerhard Weiß, et al. Distributed reinforcement learning, 1995, Robotics Auton. Syst.
[10] Dit-Yan Yeung, et al. Predictive Q-Routing: A Memory-based Reinforcement Learning Approach to Adaptive Traffic Control, 1995, NIPS.
[11] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[12] Jeff G. Schneider, et al. Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning, 1996, NIPS.
[13] Devika Subramanian, et al. Ants and Reinforcement Learning: A Case Study in Routing in Dynamic Networks, 1997, IJCAI.
[14] Satinder P. Singh, et al. How to Dynamically Merge Markov Decision Processes, 1997, NIPS.
[15] Andrew W. Moore, et al. Value Function Based Production Scheduling, 1998, ICML.
[16] E. B. Baum, et al. Manifesto for an evolutionary economics of intelligence, 1998.