Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning
Sergey Levine | Karol Hausman | Corey Lynch | Abhishek Gupta | Vikash Kumar
[1] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[2] Leslie Pack Kaelbling,et al. Learning to Achieve Goals , 1993, IJCAI.
[3] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[4] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell.
[5] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res.
[6] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst.
[8] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[9] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[10] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[11] Vikash Kumar,et al. MuJoCo HAPTIX: A virtual reality system for hand manipulation , 2015, 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids).
[12] Razvan Pascanu,et al. Policy Distillation , 2015, ICLR.
[13] Sergey Levine,et al. Optimal control with learned local models: Application to dexterous manipulation , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[14] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[15] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[16] Ion Stoica,et al. Multi-Level Discovery of Deep Options , 2017, ArXiv.
[17] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[18] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[19] Gaurav S. Sukhatme,et al. Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets , 2017, NIPS.
[20] Sergey Levine,et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[21] Sergey Levine,et al. Data-Efficient Hierarchical Reinforcement Learning , 2018, NeurIPS.
[22] Sergey Levine,et al. Visual Reinforcement Learning with Imagined Goals , 2018, NeurIPS.
[23] Sergey Levine,et al. Temporal Difference Models: Model-Free Deep RL for Model-Based Control , 2018, ICLR.
[24] Joelle Pineau,et al. OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning , 2017, AAAI.
[25] Sergey Levine,et al. Divide-and-Conquer Reinforcement Learning , 2017, ICLR.
[26] Marcin Andrychowicz,et al. Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[27] Nando de Freitas,et al. Reinforcement and Imitation Learning for Diverse Visuomotor Skills , 2018, Robotics: Science and Systems.
[28] Sergey Levine,et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.
[29] Sergey Levine,et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation , 2018, CoRL.
[30] Nan Jiang,et al. Hierarchical Imitation and Reinforcement Learning , 2018, ICML.
[31] Sergey Levine,et al. Diversity is All You Need: Learning Skills without a Reward Function , 2018, ICLR.
[32] Mohit Sharma,et al. Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information , 2018, ICLR.
[33] Brijen Thananjeyan,et al. SWIRL: A sequential windowed inverse reinforcement learning algorithm for robot tasks with delayed rewards , 2018, Int. J. Robotics Res.
[34] Corey Lynch,et al. Learning Latent Plans from Play , 2019, CoRL.
[35] Pushmeet Kohli,et al. CompILE: Compositional Imitation Learning and Execution , 2018, ICML.