Stochastic policy gradient reinforcement learning on a simple 3D biped
[1] Tad McGeer, et al. Passive Dynamic Walking, 1990, Int. J. Robotics Res.
[2] W. T. Miller. Real-time neural network control of a biped walking robot, 1994, IEEE Control Systems.
[3] Judy A. Franklin, et al. Biped dynamic walking using reinforcement learning, 1997, Robotics Auton. Syst.
[4] Andy Ruina, et al. An Uncontrolled Toy That Can Walk But Cannot Stand Still, 1997, physics/9711006.
[5] M. Coleman, et al. An Uncontrolled Walking Toy That Cannot Stand Still, 1998.
[6] Shigenobu Kobayashi, et al. An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function, 1998, ICML.
[7] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[8] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[9] Martijn Wisse, et al. A Three-Dimensional Passive-Dynamic Walking Robot with Two Legs and Knees, 2001, Int. J. Robotics Res.
[10] Jun Morimoto, et al. Minimax Differential Dynamic Programming: An Application to Robust Biped Walking, 2002, NIPS.
[11] H. Sebastian Seung, et al. Actuating a simple 3D passive dynamic walker, 2004, Proc. IEEE International Conference on Robotics and Automation (ICRA).
[12] Peter Stone, et al. Policy gradient reinforcement learning for fast quadrupedal locomotion, 2004, Proc. IEEE International Conference on Robotics and Automation (ICRA).
[13] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.