Convergence Analysis of Reinforcement Learning Approaches to Humanoid Locomotion
[1] Shin Ishii, et al. Part 4: Reinforcement learning: Machine learning and natural learning, 2006, New Generation Computing.
[2] Shalabh Bhatnagar, et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation, 2009, NIPS.
[3] Bernard Espiau, et al. A Study of the Passive Gait of a Compass-Like Biped Robot, 1998, Int. J. Robotics Res.
[4] P. Dayan, et al. Dopamine, uncertainty and TD learning, 2005, Behavioral and Brain Functions.
[5] Stefan Schaal, et al. Reinforcement Learning for Humanoid Robotics, 2003.
[6] G. Sandini, et al. The iCub cognitive architecture: Interactive development in a humanoid robot, 2007, 2007 IEEE 6th International Conference on Development and Learning.
[7] Jun Morimoto, et al. Learning CPG-based biped locomotion with a policy gradient method, 2005, 5th IEEE-RAS International Conference on Humanoid Robots, 2005.
[8] Jun Morimoto, et al. Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot, 2005, 5th IEEE-RAS International Conference on Humanoid Robots, 2005.
[9] Charles W. Anderson, et al. Strategy Learning with Multilayer Connectionist Representations, 1987.
[10] Richard S. Sutton, et al. A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation, 2008, NIPS.
[11] Russell L. Tedrake, et al. Applied optimal control for dynamically stable legged locomotion, 2004.
[12] A. V. Lensky, et al. Dynamic Walking of a Vehicle With Two Telescopic Legs Controlled by Two Drives, 1994, Int. J. Robotics Res.
[13] Andrew G. Barto, et al. Reinforcement learning, 1998.
[14] Robert Babuska, et al. Reinforcement Learning Control for Biped Robot Walking on Uneven Surfaces, 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.
[15] Kenji Doya, et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[16] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[17] Jun Morimoto, et al. Learning CPG-based biped locomotion with a policy gradient method, 2005, Humanoids.
[18] Shin Ishii, et al. Reinforcement Learning for Biped Locomotion, 2002, ICANN.
[19] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[20] Kagan Tumer, et al. Unifying temporal and structural credit assignment problems, 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004.
[21] Judy A. Franklin, et al. Biped dynamic walking using reinforcement learning, 1997, Robotics Auton. Syst.
[22] J. Meditch, et al. Applied optimal control, 1972, IEEE Transactions on Automatic Control.
[23] Yasuhisa Hasegawa, et al. Self scaling reinforcement learning for fuzzy logic controller, 1996, Proceedings of IEEE International Conference on Evolutionary Computation.
[24] Vijaykumar Gullapalli, et al. A stochastic reinforcement learning algorithm for learning real-valued functions, 1990, Neural Networks.