Finding Memoryless Probabilistic Relational Policies for Inter-task Reuse
Anna Helena Reali Costa | Valdinei Freire da Silva | Fernando A. Pereira
[1] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.
[2] M. van Otterlo. Reinforcement Learning for Relational MDPs, 2004.
[3] Luc De Raedt, et al. Relational Reinforcement Learning, 2001, Machine Learning.
[4] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.
[5] Xi-Ren Cao, et al. A Sensitivity View of Markov Decision Processes and Reinforcement Learning, 2003.
[6] Luc De Raedt, et al. Bellman goes relational, 2004, ICML.
[7] Scott Sanner, et al. Practical solution techniques for first-order MDPs, 2009, Artif. Intell.
[8] Fabio Gagliardi Cozman, et al. Simultaneous Abstract and Concrete Reinforcement Learning, 2011, SARA.
[9] Xi-Ren Cao, et al. Gradient-based policy iteration: an example, 2002, Proceedings of the 41st IEEE Conference on Decision and Control, 2002.
[10] Hongsheng Xi, et al. Finding optimal memoryless policies of POMDPs under the expected average reward criterion, 2011, Eur. J. Oper. Res.
[11] Craig Boutilier, et al. Symbolic Dynamic Programming for First-Order MDPs, 2001, IJCAI.
[12] Robert Givan, et al. Equivalence notions and model minimization in Markov decision processes, 2003, Artif. Intell.
[13] Wolfram Burgard, et al. Learning Relational Navigation Policies, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[14] Peter Stone, et al. State Abstraction Discovery from Irrelevant State Variables, 2005, IJCAI.
[15] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[16] Luc De Raedt, et al. Relational Reinforcement Learning, 1998, ILP.
[17] John Loch, et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes, 1998, ICML.
[18] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.