Robot Navigation in Constrained Pedestrian Environments using Reinforcement Learning

Navigating fluently around pedestrians is a necessary capability for mobile robots deployed in human environments, such as buildings and homes. While research on social navigation has focused mainly on scalability with the number of pedestrians in open spaces, typical indoor environments pose the additional challenge of constrained spaces such as corridors and doorways, which limit maneuverability and shape patterns of pedestrian interaction. We present an approach based on reinforcement learning (RL) to learn policies that adapt dynamically to moving pedestrians while navigating between desired locations in constrained environments. The policy network receives guidance from a motion planner that provides waypoints along a globally planned trajectory, while RL handles the local interactions. We explore a compositional principle for multi-layout training and find that policies trained in a small set of geometrically simple layouts generalize to more complex unseen layouts that are compositions of the structural elements available during training. Going beyond walls-world-like domains, we show transfer of the learned policy to unseen 3D reconstructions of two real environments. These results support the applicability of the compositional principle to navigation in real-world buildings and indicate the promise of multi-agent simulation within reconstructed environments for tasks that involve interaction.
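The described architecture splits responsibilities between a global planner, which supplies waypoints along a precomputed path, and a learned local policy that maps waypoint-relative observations to velocity commands. The following is a minimal sketch of that interface, not the paper's implementation: the function names (`next_waypoint`, `waypoint_observation`, `local_policy`) and the hand-coded policy stand-in are illustrative assumptions; in the actual system the local policy would be a trained RL network consuming richer observations (e.g., lidar and pedestrian states).

```python
import math

def next_waypoint(path, robot_xy, lookahead=1.0):
    """Return the first path point at least `lookahead` away from the robot."""
    for wx, wy in path:
        if math.hypot(wx - robot_xy[0], wy - robot_xy[1]) >= lookahead:
            return (wx, wy)
    return path[-1]  # near the goal: track the final point

def waypoint_observation(robot_xy, robot_heading, waypoint):
    """Express the waypoint in the robot frame as (distance, relative bearing)."""
    dx, dy = waypoint[0] - robot_xy[0], waypoint[1] - robot_xy[1]
    dist = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx) - robot_heading
    # wrap the bearing into [-pi, pi]
    return dist, math.atan2(math.sin(bearing), math.cos(bearing))

def local_policy(obs):
    """Stand-in for the trained RL policy: turn toward the waypoint, drive forward."""
    dist, bearing = obs
    v = min(0.5, dist)                       # linear velocity, capped
    w = max(-1.0, min(1.0, 2.0 * bearing))   # angular velocity, clipped
    return v, w

# One step of the control loop: pick a waypoint, observe it, act.
path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
wp = next_waypoint(path, (0.2, 0.0))
obs = waypoint_observation((0.2, 0.0), 0.0, wp)
cmd = local_policy(obs)
print(wp, cmd)  # → (2.0, 0.0) (0.5, 0.0)
```

The design choice this illustrates is that the policy never sees the full map: it only receives the next waypoint relative to its own frame, so the same local controller can be reused across layouts while the global planner absorbs the layout-specific reasoning.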
