Planning in Learned Latent Action Spaces for Generalizable Legged Locomotion

Hierarchical learning has been successful at learning generalizable locomotion skills on walking robots in a sample-efficient manner. However, the low-dimensional "latent" action used to communicate between layers of the hierarchy is typically user-designed. In this work, we present a fully learned hierarchical framework that jointly learns the low-level controller and the high-level latent action space. We then plan over latent actions in a model-predictive control fashion, using a learned high-level dynamics model. This framework generalizes to multiple robots, and we present results on a Daisy hexapod simulation, an A1 quadruped simulation, and Daisy robot hardware. We compare a range of learned hierarchical approaches and show that our framework is more reliable, versatile, and sample-efficient. Beyond learned approaches, we also compare against an inverse-kinematics (IK) based footstep planner, and show that our fully learned framework is competitive with IK under normal conditions and outperforms it in adverse settings. In hardware experiments, the Daisy hexapod achieves multiple locomotion tasks, such as goal reaching and trajectory and velocity tracking, in an unstructured outdoor setting with only 2,000 hardware samples.
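
To make the planning step concrete, below is a minimal sketch of model-predictive planning over a learned latent action space, using cross-entropy-method (CEM) sampling as one plausible optimizer; the paper does not specify this code, and the dynamics model, cost, and all names (latent_dim, plan_horizon, dynamics_model, etc.) are illustrative stand-ins for networks and quantities that would be learned or defined elsewhere.

    # Sketch only: MPC over latent actions z with a learned high-level
    # dynamics model s' = f(s, z). Both models below are placeholders.
    import numpy as np

    latent_dim = 4      # dimensionality of the learned latent action z (assumed)
    state_dim = 12      # high-level robot state, e.g. CoM pose and velocity (assumed)
    plan_horizon = 5    # number of latent actions planned ahead
    n_samples = 256     # candidate latent-action sequences per planning step
    n_elites = 32       # elite sequences kept per CEM iteration
    n_iters = 3         # CEM refinement iterations

    def dynamics_model(state, z):
        # Stand-in for the learned high-level dynamics model f(s, z).
        return state + 0.1 * np.tanh(np.concatenate([z, np.zeros(state_dim - latent_dim)]))

    def cost(state, goal):
        # Task cost, e.g. squared distance of the CoM position to a goal.
        return np.sum((state[:2] - goal) ** 2)

    def plan_latent_action(state, goal):
        # CEM over sequences of latent actions; returns the first action only,
        # in receding-horizon (MPC) fashion.
        mu = np.zeros((plan_horizon, latent_dim))
        sigma = np.ones((plan_horizon, latent_dim))
        for _ in range(n_iters):
            z_seqs = mu + sigma * np.random.randn(n_samples, plan_horizon, latent_dim)
            total_cost = np.zeros(n_samples)
            for i, z_seq in enumerate(z_seqs):
                s = state
                for z in z_seq:
                    s = dynamics_model(s, z)        # roll out the learned model
                    total_cost[i] += cost(s, goal)
            elites = z_seqs[np.argsort(total_cost)[:n_elites]]
            mu = elites.mean(axis=0)
            sigma = elites.std(axis=0) + 1e-3       # floor to keep exploring
        return mu[0]

    state, goal = np.zeros(state_dim), np.array([1.0, 0.5])
    z = plan_latent_action(state, goal)
    # z would then be decoded by the jointly learned low-level controller
    # into joint-level commands for the robot.

In this receding-horizon scheme, only the first latent action of the best sequence is executed before replanning, which is what lets planning errors in the learned dynamics model be corrected at every step.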
