Paper information - Blind Bipedal Stair Traversal via Sim-to-Real Reinforcement Learning

Accurate and precise terrain estimation is a difficult problem for robot locomotion in real-world environments. It is therefore useful to have systems that do not depend on accurate estimation to the point of fragility. In this paper, we explore the limits of such an approach by investigating the problem of traversing stair-like terrain on a bipedal robot without any external perception or terrain models. For such blind bipedal platforms, the problem appears difficult (even for humans) because of the unexpected elevation changes. Our main contribution is to show that sim-to-real reinforcement learning (RL) can achieve robust locomotion over stair-like terrain on the bipedal robot Cassie using only proprioceptive feedback. Importantly, this only requires modifying an existing flat-terrain RL training framework to include stair-like terrain randomization, without any changes to the reward function. To our knowledge, this is the first controller for a bipedal, human-scale robot capable of reliably traversing a variety of real-world stairs and other stair-like disturbances using only proprioception.
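As a rough illustration of what "stair-like terrain randomization" could look like in a training loop, the following Python sketch samples a random staircase per episode and exposes its height profile. It is not the authors' implementation: the names `StairParams`, `sample_stairs`, `terrain_height`, and `sample_episode_terrain`, the parameter ranges, and the 50% stair probability are all assumptions chosen for illustration, and the abstract does not state any of these values.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class StairParams:
    """Hypothetical description of one randomized ascending staircase."""
    num_steps: int    # number of steps before the top landing
    rise_m: float     # height of each step, in metres
    run_m: float      # depth (tread length) of each step, in metres
    start_x_m: float  # flat ground before the first step, in metres


def sample_stairs(rng: random.Random) -> StairParams:
    """Draw one staircase. All ranges are placeholders, not the paper's values."""
    return StairParams(
        num_steps=rng.randint(1, 8),
        rise_m=rng.uniform(0.05, 0.20),
        run_m=rng.uniform(0.25, 0.40),
        start_x_m=rng.uniform(0.5, 3.0),
    )


def terrain_height(x: float, stairs: StairParams) -> float:
    """Ground height at forward position x: flat, then steps, then a flat landing."""
    if x < stairs.start_x_m:
        return 0.0
    # The first step begins exactly at start_x_m, so add 1 to the tread index.
    steps_climbed = int((x - stairs.start_x_m) // stairs.run_m) + 1
    return min(steps_climbed, stairs.num_steps) * stairs.rise_m


def sample_episode_terrain(
    rng: random.Random, stair_probability: float = 0.5
) -> Optional[StairParams]:
    """Return None for a flat-ground episode, otherwise a random staircase.

    Mixing flat and stair episodes is an assumption of this sketch; the
    reward function is untouched, only the terrain changes.
    """
    if rng.random() < stair_probability:
        return sample_stairs(rng)
    return None
```

In a simulator, the sampled parameters would be used to place step geometry at the start of each episode, while the policy still observes only proprioceptive signals and never sees `StairParams` directly.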
