Probabilistic policy reuse in a reinforcement learning agent
[1] C. Watkins. Learning from delayed rewards, 1989.
[2] Sebastian Thrun, et al. Efficient Exploration In Reinforcement Learning, 1992.
[3] Sebastian Thrun, et al. Finding Structure in Reinforcement Learning, 1994, NIPS.
[4] Doina Precup, et al. Intra-Option Learning about Temporally Abstract Actions, 1998, ICML.
[5] M. Veloso, et al. Bounding the suboptimality of reusing subproblems, 1999, IJCAI.
[6] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res..
[7] Bernhard Hengst, et al. Discovering Hierarchy in Reinforcement Learning with HEXQ, 2002, ICML.
[8] James L. Carroll, et al. Fixed vs. Dynamic Sub-Transfer in Reinforcement Learning, 2002, ICMLA.
[9] Manuela Veloso, et al. Tree based hierarchical reinforcement learning, 2002.
[10] Fernando Fernández, et al. On Determinism Handling While Learning Reduced State Space Representations, 2002, ECAI.
[11] C. Boutilier, et al. Accelerating Reinforcement Learning through Implicit Imitation, 2003, J. Artif. Intell. Res..
[12] Michael G. Madden, et al. Transfer of Experience Between Reinforcement Learning Environments with Progressive Difficulty, 2004, Artificial Intelligence Review.
[13] Peter Stone, et al. Behavior transfer for value-function-based reinforcement learning, 2005, AAMAS '05.
[14] Peter Stone, et al. Value Functions for RL-Based Behavior Transfer: A Comparative Study, 2005, AAAI.
[15] Alicia P. Wolfe, et al. Identifying useful subgoals in reinforcement learning by local graph partitioning, 2005, ICML.
[16] Peter Vamplew, et al. Concurrent Q-learning: Reinforcement learning for dynamic goals and environments, 2005, Int. J. Intell. Syst..
[17] Manuela Veloso, et al. Exploration and Policy Reuse, 2005.
[18] Jude W. Shavlik, et al. Giving Advice about Preferred Actions to Reinforcement Learners Via Knowledge-Based Kernel Regression, 2005, AAAI.
[19] Peter Stone, et al. Improving Action Selection in MDP's via Knowledge Transfer, 2005, AAAI.