Real-Time Reinforcement Learning
[1] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[2] Thomas J. Walsh, et al. Learning and planning in environments with delayed feedback, 2009, Autonomous Agents and Multi-Agent Systems.
[3] R. Bellman. A Markovian Decision Process, 1957.
[4] David Silver, et al. Learning values across many orders of magnitude, 2016, NIPS.
[5] Roland Siegwart, et al. Control of a Quadrotor With Reinforcement Learning, 2017, IEEE Robotics and Automation Letters.
[6] Gerald Tesauro, et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.
[7] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[8] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[9] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[10] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[11] Yann Ollivier, et al. Making Deep Q-learning methods robust to time discretization, 2019, ICML.
[12] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[13] Joshua B. Tenenbaum, et al. At Human Speed: Deep Reinforcement Learning with Action Delay, 2018, ArXiv.
[14] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[15] Joonho Lee, et al. Learning agile and dynamic motor skills for legged robots, 2019, Science Robotics.
[16] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[17] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, ArXiv.
[18] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[19] Sergey Levine, et al. Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[20] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[21] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[22] Patrick M. Pilarski, et al. Reactive Reinforcement Learning in Asynchronous Environments, 2018, Front. Robot. AI.