Paper information - AIMS CDT Project Report: Towards One-Shot Learning From Demonstration via Reinforcement Learning

We explore meta-learning algorithms and architectures for one-shot learning from demonstration via reinforcement learning. We provide evidence that REPTILE is not effective for meta-learning in reinforcement learning environments, and we present preliminary findings on how well GRUs achieve ‘fast adaptation’ to tasks in such environments.
