Improving Action Selection in MDP's via Knowledge Transfer
暂无分享,去创建一个
[1] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[2] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[3] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[4] Ashwin Ram,et al. Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces , 1997, Adapt. Behav..
[5] Milos Hauskrecht,et al. Hierarchical Solution of Markov Decision Processes using Macro-actions , 1998, UAI.
[6] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[7] Alexander Zelinsky,et al. Q-Learning in Continuous State and Action Spaces , 1999, Australian Joint Conference on Artificial Intelligence.
[8] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[9] Craig Boutilier,et al. Symbolic Dynamic Programming for First-Order MDPs , 2001, IJCAI.
[10] Peter Stone,et al. Scaling Reinforcement Learning toward RoboCup Soccer , 2001, ICML.
[11] Carlos Guestrin,et al. Generalizing plans to new environments in relational MDPs , 2003, IJCAI 2003.
[12] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.