Hierarchical Reinforcement Learning with Parameters
[1] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[2] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[3] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[4] Thomas G. Dietterich,et al. Automatic discovery and transfer of MAXQ hierarchies , 2008, ICML '08.
[5] Pravesh Ranchod,et al. Reinforcement Learning with Parameterized Actions , 2015, AAAI.
[6] Pieter Abbeel,et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning , 2016, ICLR.
[7] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[8] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[9] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[10] Christian Léonard,et al. Transport Inequalities. A Survey , 2010, arXiv:1003.3852.
[11] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[12] Yuval Tassa,et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation , 2017, ArXiv.
[13] Emanuel Todorov,et al. From task parameters to motor synergies: A hierarchical framework for approximately optimal control of redundant manipulators , 2005, J. Field Robotics.
[14] F. Hollander. Probability Theory: The Coupling Method , 2012 .
[15] Jan Peters,et al. Probabilistic inference for determining options in reinforcement learning , 2016, Machine Learning.
[16] Yuval Tassa,et al. Learning and Transfer of Modulated Locomotor Controllers , 2016, ArXiv.
[17] Chee-Meng Chew,et al. Virtual Model Control: An Intuitive Approach for Bipedal Locomotion , 2001, Int. J. Robotics Res..
[18] Pieter Abbeel,et al. Automatic Goal Generation for Reinforcement Learning Agents , 2017, ICML.
[19] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[20] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[21] Bruno Castro da Silva,et al. Active Learning of Parameterized Skills , 2014, ICML.
[22] I. Csiszár. $I$-Divergence Geometry of Probability Distributions and Minimization Problems , 1975 .
[23] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[24] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[25] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[26] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[27] John Schulman,et al. Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs , 2016 .
[28] Jürgen Schmidhuber,et al. HQ-Learning , 1997, Adapt. Behav..
[29] Sergey Levine,et al. Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic , 2016, ICLR.
[30] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.