Learning a model-free robotic continuous state-action task through contractive Q-network
Khalil Alipour | Bahram Tarvirdizadeh | Alireza Hadi | Mohammadjavad Davari