Yuval Tassa | Martin A. Riedmiller | Yazhe Li | David Budden | Abbas Abdolmaleki | Tom Erez | Timothy P. Lillicrap | Alistair Muldal | Yotam Doron | Josh Merel | Diego de Las Casas | Andrew Lefrancq
[1] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[2] Jing Peng, et al. Function Optimization using Connectionist Reinforcement Learning Algorithms, 1991.
[3] Karl Sims, et al. Evolving virtual creatures, 1994, SIGGRAPH.
[4] Mark W. Spong, et al. The swing up control problem for the Acrobot, 1995.
[5] David K. Smith, et al. Dynamic Programming and Optimal Control. Volume 1, 1996.
[6] Rémi Coulom, et al. Reinforcement Learning Using Neural Networks, with Applications to Motor Control. (Apprentissage par renforcement utilisant des réseaux de neurones, avec des applications au contrôle moteur), 2002.
[7] Pawel Wawrzynski, et al. Real-time reinforcement learning by sequential Actor-Critics and experience replay, 2009, Neural Networks.
[8] Yuval Tassa, et al. Stochastic Complementarity for Local Control of Discontinuous Dynamics, 2010, Robotics: Science and Systems.
[9] Yuval Tassa, et al. Synthesis and stabilization of complex behaviors through online trajectory optimization, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[10] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[11] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[12] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[13] Yuval Tassa, et al. Simulation tools for model-based robotics: Comparison of Bullet, Havok, MuJoCo, ODE and PhysX, 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[14] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[15] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[16] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[17] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[18] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[19] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[20] Peter Henderson, et al. Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control, 2017, ArXiv.
[21] Yuval Tassa, et al. Learning human behaviors from motion capture by adversarial imitation, 2017, ArXiv.
[22] Yuval Tassa, et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation, 2017, ArXiv.
[23] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.