Effective Policy Adjustment via Meta-Learning for Complex Manipulation Tasks
暂无分享,去创建一个
[1] Jianfeng Gao,et al. Deep Reinforcement Learning for Dialogue Generation , 2016, EMNLP.
[2] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[3] Marcin Andrychowicz,et al. Learning to learn by gradient descent by gradient descent , 2016, NIPS.
[4] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[5] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[6] Pieter Abbeel,et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments , 2017, ICLR.
[7] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[8] Yoshua Bengio,et al. On the Optimization of a Synaptic Learning Rule , 2007 .
[9] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[10] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[11] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[12] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[13] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[14] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[15] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.