Jiayu Zhou | Bo Dai | Kaixiang Lin | Zhuangdi Zhu
[1] Masashi Sugiyama, et al. Imitation Learning from Imperfect Demonstration, 2019, ICML.
[2] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.
[3] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[4] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[5] Sergey Levine, et al. Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction, 2019, NeurIPS.
[6] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[7] Stefan Schaal, et al. Is imitation learning the route to humanoid robots?, 1999, Trends in Cognitive Sciences.
[8] Mikael Henaff, et al. Disagreement-Regularized Imitation Learning, 2020, ICLR.
[9] Anind K. Dey, et al. Modeling Interaction via the Principle of Maximum Causal Entropy, 2010, ICML.
[10] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[11] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[12] Martin A. Riedmiller, et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards, 2017, arXiv.
[13] Joelle Pineau, et al. Learning from Limited Demonstrations, 2013, NIPS.
[14] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[15] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[16] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[17] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[18] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[19] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[20] Sergey Levine, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[21] Byron Boots, et al. Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction, 2017, ICML.
[22] Jiashi Feng, et al. Policy Optimization with Demonstrations, 2018, ICML.
[23] Tetsuya Yohira, et al. Sample Efficient Imitation Learning for Continuous Control, 2018, ICLR.
[24] Ilya Kostrikov, et al. Imitation Learning via Off-Policy Distribution Matching, 2019, ICLR.
[25] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[26] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[27] Yiannis Demiris, et al. Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation, 2019, ICML.
[28] Sebastian Nowozin, et al. f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization, 2016, NIPS.
[29] Robert E. Schapire, et al. A Game-Theoretic Approach to Apprenticeship Learning, 2007, NIPS.
[30] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[31] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[32] Byron Boots, et al. Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning, 2018, ICLR.