Natural Policy Gradient Reinforcement Learning for a CPG Control of a Biped Robot
[1] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[2] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[3] S. Grillner, et al. Neuronal network generating locomotor behavior in lamprey: circuitry, transmitters, membrane properties, and simulation, 1991, Annual Review of Neuroscience.
[4] Shin Ishii, et al. Reinforcement Learning Based on On-Line EM Algorithm, 1998, NIPS.
[5] Hiroshi Shimizu, et al. Self-organized control of bipedal locomotion by neural oscillators in unpredictable environment, 1991, Biological Cybernetics.
[6] Steven J. Bradtke, et al. Linear Least-Squares algorithms for temporal difference learning, 2004, Machine Learning.
[7] Stefan Schaal, et al. Reinforcement Learning for Humanoid Robotics, 2003.
[8] Michail G. Lagoudakis, et al. Least-Squares Methods in Reinforcement Learning for Control, 2002, SETN.
[9] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[10] Vijay R. Konda, et al. On Actor-Critic Algorithms, 2003, SIAM J. Control Optim.
[11] Shin Ishii, et al. Reinforcement Learning for Biped Locomotion, 2002, ICANN.