Adaptive Skills, Adaptive Partitions (ASAP)
[1] Eric Eaton, et al. Online Multi-Task Learning for Policy Gradient Methods, 2014, ICML.
[2] Shie Mannor, et al. Time-regularized interrupting options, 2014, ICML.
[3] Doina Precup, et al. Multi-time Models for Temporally Abstract Planning, 1997, NIPS.
[4] Shie Mannor, et al. Scaling Up Approximate Value Iteration with Options: Better Policies with Fewer Iterations, 2014, ICML.
[5] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[6] Peter Stone, et al. Deep Reinforcement Learning in Parameterized Action Space, 2015, ICLR.
[7] F. Richard Yu, et al. Distributed Unit Commitment Scheduling in the Future Smart Grid with Intermittent Renewable Energy Resources and Stochastic Power Demands, 2014.
[8] Pravesh Ranchod, et al. Reinforcement Learning with Parameterized Actions, 2015, AAAI.
[9] Doina Precup, et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options, 1998, ECML.
[10] Nicholas Roy, et al. Efficient Planning under Uncertainty with Macro-actions, 2014, J. Artif. Intell. Res.
[11] Shie Mannor, et al. Learning When to Switch between Skills in a High Dimensional Domain, 2015, AAAI Workshop: Learning for General Competency in Video Games.
[12] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[13] Sebastian Thrun, et al. Lifelong robot learning, 1993, Robotics Auton. Syst.
[14] David K. Smith, et al. Dynamic Programming and Optimal Control. Volume 1, 1996.
[15] Milos Hauskrecht, et al. Hierarchical Solution of Markov Decision Processes using Macro-actions, 1998, UAI.
[16] Tomoharu Nakashima, et al. HELIOS Base: An Open Source Package for the RoboCup Soccer 2D Simulation, 2013, RoboCup.
[17] Lihong Li, et al. PAC-inspired Option Discovery in Lifelong Reinforcement Learning, 2014, ICML.
[18] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[19] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[20] Sergey Levine, et al. One-shot learning of manipulation skills with online dynamics adaptation and neural network priors, 2015, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[21] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[22] Doina Precup, et al. Optimal policy switching algorithms for reinforcement learning, 2010, AAMAS.
[23] Andrew G. Barto, et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining, 2009, NIPS.
[24] Stefan Schaal, et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients, 2008.
[25] Eric Eaton, et al. ELLA: An Efficient Lifelong Learning Algorithm, 2013, ICML.
[26] Eric Eaton, et al. Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret, 2015, ICML.
[27] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[28] Bruno Castro da Silva, et al. Learning Parameterized Skills, 2012, ICML.
[29] Feng Wu, et al. Online planning for large MDPs with MAXQ decomposition, 2012, AAMAS.
[30] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[31] David Silver, et al. Compositional Planning Using Optimal Option Models, 2012, ICML.