Overcoming Exploration in Reinforcement Learning with Demonstrations
Ashvin Nair | Bob McGrew | Marcin Andrychowicz | Wojciech Zaremba | Pieter Abbeel
[1] Terry Winograd, et al. Understanding natural language, 1974.
[2] Dean Pomerleau, et al. ALVINN, an autonomous land vehicle in a neural network, 1988, NIPS.
[3] Lydia E. Kavraki, et al. Probabilistic roadmaps for path planning in high-dimensional configuration spaces, 1996, IEEE Trans. Robotics Autom.
[4] B. Faverjon, et al. Probabilistic Roadmaps for Path Planning in High-Dimensional Configuration Spaces, 1996.
[5] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.
[6] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[7] Jun Morimoto, et al. Learning from demonstration and adaptation of biped locomotion, 2004, Robotics Auton. Syst.
[8] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[9] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[10] Stefan Schaal, et al. Reinforcement learning of motor skills with policy gradients, 2008, Neural Networks.
[11] Stefan Schaal, et al. Learning locomotion over rough terrain using terrain templates, 2009, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[12] Jan Peters, et al. Policy Search for Motor Primitives in Robotics, 2008, NIPS.
[13] Yasemin Altun, et al. Relative Entropy Policy Search, 2010, AAAI.
[14] Carl E. Rasmussen, et al. Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning, 2011, Robotics: Science and Systems.
[15] Carl E. Rasmussen, et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, 2011, ICML.
[16] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[17] Leslie Pack Kaelbling, et al. Hierarchical task and motion planning in the now, 2011, IEEE International Conference on Robotics and Automation.
[18] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[19] Joelle Pineau, et al. Learning from Limited Demonstrations, 2013, NIPS.
[20] Pieter Abbeel, et al. Combined task and motion planning through an extensible planner-independent interface layer, 2014, IEEE International Conference on Robotics and Automation (ICRA).
[21] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[22] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[23] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[24] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[25] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[26] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[27] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[28] Xin Zhang, et al. End to End Learning for Self-Driving Cars, 2016, ArXiv.
[29] Jürgen Schmidhuber, et al. A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots, 2016, IEEE Robotics and Automation Letters.
[30] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[31] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[32] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[33] Abhinav Gupta, et al. Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours, 2016, IEEE International Conference on Robotics and Automation (ICRA).
[34] Martin A. Riedmiller, et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards, 2017, ArXiv.
[35] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[36] Marcin Andrychowicz, et al. One-Shot Imitation Learning, 2017, NIPS.
[37] Pieter Abbeel, et al. Reverse Curriculum Generation for Reinforcement Learning, 2017, CoRL.
[38] Tom Schaul, et al. Learning from Demonstrations for Real World Reinforcement Learning, 2017, ArXiv.
[39] Sergey Levine, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[40] Yuval Tassa, et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation, 2017, ArXiv.
[41] Sergey Levine, et al. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection, 2016, Int. J. Robotics Res.
[42] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.