Energy-Based Legged Robots Terrain Traversability Modeling via Deep Inverse Reinforcement Learning

This work reports on developing a deep inverse reinforcement learning method for legged robot terrain traversability modeling that incorporates both exteroceptive and proprioceptive sensory data. Existing works use robot-agnostic exteroceptive environmental features or handcrafted kinematic features; instead, we propose to also learn robot-specific inertial features from proprioceptive sensory data for reward approximation in a single deep neural network. Incorporating the inertial features can improve the model fidelity and provide a reward that depends on the robot’s state during deployment. We train the reward network using the Maximum Entropy Deep Inverse Reinforcement Learning (MEDIRL) algorithm and propose simultaneously minimizing a trajectory ranking loss to deal with the suboptimality of legged robot demonstrations. The demonstrated trajectories are ranked by locomotion energy consumption in order to learn an energy-aware reward function and a policy that is more energy-efficient than the demonstrations. We evaluate our method using a dataset collected by an MIT Mini-Cheetah robot and a Mini-Cheetah simulator. The code is publicly available.
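To make the energy-based ranking idea concrete, the following is a minimal sketch of a pairwise trajectory ranking loss. It assumes a Bradley-Terry style formulation, as commonly used in learning from ranked demonstrations; the abstract does not state the exact form of the loss, so this should not be read as the authors' implementation. The grid-shaped reward map and the names `trajectory_return`, `pairwise_ranking_loss`, and `energy_ranking_loss` are hypothetical and chosen only for illustration.

```python
import math


def trajectory_return(reward_map, trajectory):
    """Sum the per-cell reward along a trajectory of (row, col) grid cells."""
    return sum(reward_map[r][c] for r, c in trajectory)


def pairwise_ranking_loss(return_worse, return_better):
    """Negative log-probability that the better trajectory is preferred.

    Equals log(1 + exp(return_worse - return_better)), computed stably.
    """
    diff = return_worse - return_better
    return max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def energy_ranking_loss(reward_map, trajectories, energies):
    """Average pairwise loss over all pairs whose energies differ.

    Assumption: a trajectory with lower energy consumption is ranked higher.
    """
    returns = [trajectory_return(reward_map, t) for t in trajectories]
    total, pairs = 0.0, 0
    for i in range(len(trajectories)):
        for j in range(len(trajectories)):
            if energies[i] < energies[j]:  # trajectory i is the better one
                total += pairwise_ranking_loss(returns[j], returns[i])
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy 2x2 reward map: the left column is cheap terrain, the right is costly.
reward_map = [[0.0, -1.0],
              [0.0, -1.0]]
left = [(0, 0), (1, 0)]   # return 0.0
right = [(0, 1), (1, 1)]  # return -2.0

# The reward agrees with the energy ranking (left uses less energy).
loss_agree = energy_ranking_loss(reward_map, [left, right], [1.0, 3.0])
# The reward disagrees with the energy ranking (right uses less energy).
loss_disagree = energy_ranking_loss(reward_map, [left, right], [3.0, 1.0])
```

In this sketch the loss is small when the learned reward already assigns a higher return to the lower-energy trajectory and large otherwise, so minimizing it alongside the MEDIRL objective would push the reward toward the energy ordering. How the two terms are weighted is not specified in the abstract.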
