Acquiring of walking behavior for four-legged robots using actor-critic method based on policy gradient
[1] Wang Zhan-quan. Reinforcement Learning Theory, Algorithms and Application , 2006 .
[2] Jun Morimoto,et al. Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot , 2005, 5th IEEE-RAS International Conference on Humanoid Robots, 2005.
[3] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[4] Ralf Der,et al. A Sensor-Based Learning Algorithm for the Self-Organization of Robot Behavior , 2009, Algorithms.
[5] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[6] Jun Morimoto,et al. Learning CPG-based biped locomotion with a policy gradient method , 2005, 5th IEEE-RAS International Conference on Humanoid Robots, 2005.
[7] Vijay R. Konda,et al. On Actor-Critic Algorithms , 2003, SIAM J. Control. Optim..