[1] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[2] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[3] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[4] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[5] Marc Peter Deisenroth,et al. Deep Reinforcement Learning: A Brief Survey , 2017, IEEE Signal Processing Magazine.
[6] Ken Goldberg,et al. Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation , 2017, ICRA.
[7] Lei Han,et al. Curriculum-guided Hindsight Experience Replay , 2019, NeurIPS.
[8] Richard Socher,et al. Competitive Experience Replay , 2019, ICLR.
[9] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[10] Pieter Abbeel,et al. Goal-conditioned Imitation Learning , 2019, NeurIPS.
[11] Marcin Andrychowicz,et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research , 2018, ArXiv.
[12] Peter Stone,et al. Behavioral Cloning from Observation , 2018, IJCAI.
[13] Martin A. Riedmiller,et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards , 2017, ArXiv.
[14] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[15] Yurong Liu,et al. A survey of deep neural network architectures and their applications , 2017, Neurocomputing.
[16] Satinder Singh,et al. Self-Imitation Learning , 2018, ICML.
[17] Sergey Levine,et al. One-Shot Visual Imitation Learning via Meta-Learning , 2017, CoRL.
[18] Nando de Freitas,et al. Robust Imitation of Diverse Behaviors , 2017, NIPS.
[19] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[20] Masashi Sugiyama,et al. Imitation Learning from Imperfect Demonstration , 2019, ICML.
[21] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[22] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[23] Sergey Levine,et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization , 2016, ICML.
[24] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[25] Andrew J. Davison,et al. Task-Embedded Control Networks for Few-Shot Imitation Learning , 2018, CoRL.
[26] Xin Zhang,et al. End to End Learning for Self-Driving Cars , 2016, ArXiv.
[27] Volker Tresp,et al. Energy-Based Hindsight Experience Prioritization , 2018, CoRL.
[28] Pieter Abbeel,et al. Reverse Curriculum Generation for Reinforcement Learning , 2017, CoRL.
[29] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[30] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput.
[31] Yang Gao,et al. Reinforcement Learning from Imperfect Demonstrations , 2018, ICLR.
[32] Jakub W. Pachocki,et al. Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res.
[33] Boqing Gong,et al. DHER: Hindsight Experience Replay for Dynamic Goals , 2018, ICLR.
[34] Mohamed Medhat Gaber,et al. Imitation Learning , 2017, ACM Comput. Surv.
[35] Samy Bengio,et al. Self-Imitation Learning via Trajectory-Conditioned Policy for Hard-Exploration Tasks , 2019.
[36] Fuchun Sun,et al. Survey of imitation learning for robotic manipulation , 2019, International Journal of Intelligent Robotics and Applications.
[37] Yang Gao,et al. End-to-End Learning of Driving Models from Large-Scale Video Datasets , 2016, CVPR 2017.
[38] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[39] Filipe Wall Mutz,et al. Hindsight policy gradients , 2017, ICLR.
[40] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.
[41] Tao Lu,et al. Hindsight Generative Adversarial Imitation Learning , 2019, ArXiv.
[42] Sergey Levine,et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.
[43] Yunhao Tang. Self-Imitation Learning via Generalized Lower Bound Q-learning , 2020, NeurIPS.
[44] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[45] Sae-Young Chung,et al. Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update , 2018, NeurIPS.
[46] Qiang Liu,et al. Learning Self-Imitating Diverse Policies , 2018, ICLR.
[47] Marcin Andrychowicz,et al. Overcoming Exploration in Reinforcement Learning with Demonstrations , 2017, ICRA 2018.