Paper information - A strategy for push recovery in quadruped robot based on reinforcement learning

This paper proposes a reinforcement learning (RL) strategy for push recovery in quadruped robots. The strategy first uses a simplified model of the quadruped robot to reduce the dimensions of the state and action spaces in the RL framework, and then improves the efficiency of the algorithm by using the prior knowledge that the simplified model provides. Through learning, the strategy gives the robot a foot placement estimate that lets it restore balance when pushed. A comparison with the traditional algorithm on a joint simulation platform shows that the proposed algorithm is effective and converges quickly.
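
The idea of a simplified model supplying a foot placement prior can be illustrated with a minimal sketch. The abstract does not say which simplified model the authors use; the code below assumes a linear inverted pendulum and its capture point purely for illustration. The names `capture_point`, `foot_placement`, and `learned_offset` are hypothetical and do not come from the paper. Under this assumption the state shrinks to two numbers (centre-of-mass position and velocity) and the action to one (a correction to the model's foot placement), which is the kind of dimension reduction the abstract describes.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def capture_point(x, v, height):
    """Capture point of a linear inverted pendulum (assumed model, not the paper's).

    x: centre-of-mass position (m), v: centre-of-mass velocity (m/s),
    height: constant centre-of-mass height (m).
    """
    return x + v * math.sqrt(height / G)


def foot_placement(x, v, height, learned_offset=0.0):
    """Model-based prior plus a hypothetical correction that an RL agent would learn."""
    return capture_point(x, v, height) + learned_offset


# Example: a push gives the body 0.5 m/s of forward velocity at 0.4 m height.
prior = foot_placement(0.0, 0.5, 0.4)
corrected = foot_placement(0.0, 0.5, 0.4, learned_offset=0.02)
print(f"prior: {prior:.3f} m, corrected: {corrected:.3f} m")
```

In such a setup the learner only has to find a small one-dimensional offset around the model's estimate instead of searching the full joint space, which is one plausible reason a prior of this kind would speed up convergence.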

Jian Wang | Yang-zhen Chen | Wen-Qi Hou | Jian-Wen Wang | Hong-xu Ma
