Learning to Navigate Sidewalks in Outdoor Environments

Outdoor navigation on sidewalks in urban environments is a key technology behind important human-assistive applications such as last-mile delivery and neighborhood patrol. This paper aims to develop a quadruped robot that follows a route plan generated by public map services while remaining on sidewalks and avoiding collisions with obstacles and pedestrians. We devise a two-stage learning framework that first trains a teacher agent in an abstract world with privileged ground-truth information, and then applies Behavior Cloning to transfer the skills to a student agent that has access only to realistic sensors. The main research effort of this paper focuses on overcoming the challenges of deploying the student policy on a quadruped robot in the real world. We propose methodologies for designing sensing modalities, network architectures, and training procedures that enable zero-shot policy transfer to unstructured and dynamic real outdoor environments. We evaluate our learning framework on a quadrupedal robot navigating sidewalks in the city of Atlanta, USA. Using the learned navigation policy and its onboard sensors, the robot is able to walk 3.2 kilometers with a limited number of human interventions. Project webpage: https://initmaks.com/navigation
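The two-stage teacher–student idea can be sketched in a few lines. This is a minimal, pure-Python illustration under stated assumptions: the `teacher_policy` heuristic, the privileged state (lateral sidewalk offset, obstacle distance), the noisy sensor model, and the linear student are all hypothetical stand-ins for the paper's neural policies and real sensor inputs.

```python
import random

random.seed(0)

# Hypothetical abstract-world teacher: maps privileged ground-truth state
# (lateral sidewalk offset, distance to nearest obstacle) to a steering command.
def teacher_policy(offset, obstacle_dist):
    return -0.8 * offset + 0.2 / max(obstacle_dist, 0.1)

# Stage-two dataset: the student only sees noisy "sensor" readings, yet is
# supervised with actions computed from the privileged state.
data = []
for _ in range(2000):
    offset = random.uniform(-1.0, 1.0)
    dist = random.uniform(0.5, 5.0)
    obs = (offset + random.gauss(0, 0.05), dist + random.gauss(0, 0.05))
    data.append((obs, teacher_policy(offset, dist)))

# Behavior Cloning reduces to supervised regression on the teacher's actions;
# a linear student trained by SGD stands in for the paper's neural policy.
w = [0.0, 0.0, 0.0]  # weights for [offset_obs, dist_obs, bias]
lr = 0.01
for _ in range(200):
    for (o1, o2), a in data:
        err = (w[0] * o1 + w[1] * o2 + w[2]) - a
        w[0] -= lr * err * o1
        w[1] -= lr * err * o2
        w[2] -= lr * err

def student_policy(o1, o2):
    return w[0] * o1 + w[1] * o2 + w[2]
```

After training, the student steers away from the sidewalk edge from sensor observations alone, e.g. a positive offset yields a negative (corrective) steering command, mirroring the teacher it was cloned from.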
