An Efficient Approach to Model-Based Hierarchical Reinforcement Learning
[1] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[2] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[3] Marc Toussaint, et al. Hierarchical Monte-Carlo Planning, 2015, AAAI.
[4] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[5] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[6] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[7] Sungyoung Lee, et al. Approximate planning for Bayesian hierarchical reinforcement learning, 2014, Applied Intelligence.
[8] Andre Cohen, et al. An object-oriented representation for efficient reinforcement learning, 2008, ICML '08.
[9] Shie Mannor, et al. Time-Regularized Interrupting Options (TRIO), 2014, ICML.
[10] Peter Stone, et al. Hierarchical model-based reinforcement learning: R-max + MAXQ, 2008, ICML '08.
[11] Andrew G. Barto, et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining, 2009, NIPS.
[12] Feng Cao, et al. Bayesian Hierarchical Reinforcement Learning, 2012, NIPS.
[13] Alexander L. Strehl, et al. Probably Approximately Correct (PAC) Exploration in Reinforcement Learning, 2008, ISAIM.
[14] Lihong Li, et al. PAC-inspired Option Discovery in Lifelong Reinforcement Learning, 2014, ICML.
[15] David Silver, et al. Compositional Planning Using Optimal Option Models, 2012, ICML.
[16] Wolfgang Ertel, et al. Monte Carlo Bayesian hierarchical reinforcement learning, 2014, AAMAS.
[17] Nahum Shimkin, et al. Unified Inter and Intra Options Learning Using Policy Gradient Methods, 2011, EWRL.
[18] Stuart J. Russell, et al. Markovian State and Action Abstractions for MDPs via Hierarchical MCTS, 2016, IJCAI.
[19] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[20] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[21] Olivier Michel, et al. Cyberbotics Ltd. Webots™: Professional Mobile Robot Simulation, 2004, ArXiv.