Efficient Imitation Learning with Local Trajectory Optimization
Navdeep Jaitly | Azalia Mirhoseini | Ebrahim M. Songhori | Amir Yazdanbakhsh | Jialin Song | Joe Wenjie Jiang | Anna Goldie
[1] Pieter Abbeel,et al. Third-Person Imitation Learning , 2017, ICLR.
[2] Quoc V. Le,et al. A Hierarchical Model for Device Placement , 2018, ICLR.
[3] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[4] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[5] Samy Bengio,et al. Device Placement Optimization with Reinforcement Learning , 2017, ICML.
[6] Dean Pomerleau,et al. ALVINN, an autonomous land vehicle in a neural network , 1988, NIPS.
[7] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[8] John Langford,et al. Search-based structured prediction , 2009, Machine Learning.
[9] Honglak Lee,et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning , 2014, NIPS.
[10] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[11] David Barber,et al. Thinking Fast and Slow with Deep Learning and Tree Search , 2017, NIPS.
[12] Ilya Kostrikov,et al. Imitation Learning via Off-Policy Distribution Matching , 2019, ICLR.
[13] Dean Pomerleau,et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation , 1991, Neural Computation.
[14] Anca D. Dragan,et al. DART: Noise Injection for Robust Imitation Learning , 2017, CoRL.
[15] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[16] Sergey Levine,et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[17] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[18] Byron Boots,et al. Dual Policy Iteration , 2018, NeurIPS.
[19] Sergey Levine,et al. End-to-End Robotic Reinforcement Learning without Reward Engineering , 2019, Robotics: Science and Systems.
[20] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[21] Azalia Mirhoseini,et al. GDP: Generalized Device Placement for Dataflow Graphs , 2019, ArXiv.
[22] J. Andrew Bagnell,et al. Efficient Reductions for Imitation Learning , 2010, AISTATS.
[23] Simon M. Lucas,et al. A Survey of Monte Carlo Tree Search Methods , 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[24] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[25] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.