Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach

In this paper, with a view toward fast deployment of locomotion gaits on low-cost hardware, we use a linear policy to realize end-foot trajectories in the quadruped robot Stoch 2. In particular, the parameters of the end-foot trajectories are shaped via a linear feedback policy that takes the torso orientation and the terrain slope as inputs. The corresponding desired joint angles are obtained via an inverse kinematics solver and tracked via a PID control law. Augmented Random Search, a model-free and gradient-free learning algorithm, is used to train this linear policy. Simulation results show that the resulting walking is robust to external pushes and terrain slope variations. This methodology is not only computationally lightweight but also requires minimal sensing and actuation capabilities from the robot, thereby justifying the approach.
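The pipeline described above, a linear map from observations to trajectory parameters, trained with Augmented Random Search, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the state and action dimensions, the `rollout` interface, and all hyperparameter values are assumptions, and the update shown is the basic (unnormalized) ARS variant.

```python
import numpy as np

# Hypothetical dimensions: the state holds torso orientation (roll, pitch)
# and an estimated terrain slope; the action is a vector of end-foot
# trajectory parameters, later converted to joint angles by an IK solver.
STATE_DIM = 3
ACTION_DIM = 8

def linear_policy(M, state):
    """Map the observed state to end-foot trajectory parameters.

    M has shape (ACTION_DIM, STATE_DIM); the policy is purely linear,
    so training reduces to searching over the entries of M.
    """
    return M @ state

def ars_step(M, rollout, step_size=0.02, noise=0.03, n_dirs=8, rng=None):
    """One basic Augmented Random Search update (random-search V1 style).

    `rollout(M)` must return the episodic reward obtained by running the
    policy matrix M in the environment. ARS is gradient-free: it probes
    random perturbation directions and moves M along the reward-weighted
    average of those directions.
    """
    rng = np.random.default_rng() if rng is None else rng
    deltas = [rng.standard_normal(M.shape) for _ in range(n_dirs)]
    update = np.zeros_like(M)
    for d in deltas:
        r_plus = rollout(M + noise * d)    # reward with +perturbation
        r_minus = rollout(M - noise * d)   # reward with -perturbation
        update += (r_plus - r_minus) * d   # finite-difference estimate
    return M + (step_size / n_dirs) * update
```

Because the policy is a single matrix, each training iteration costs only `2 * n_dirs` rollouts and a handful of matrix additions, which is what makes the approach attractive for low-cost, compute-constrained hardware.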
