Training a deep policy gradient-based neural network with asynchronous learners on a simulated robotic problem
[1] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[2] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[3] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[4] Sergey Levine, et al. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection, 2016, Int. J. Robotics Res.
[5] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[6] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[7] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[8] Sergey Levine, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[9] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[10] Peter I. Corke, et al. Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, 2015, ICRA 2015.
[11] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.