DWA-RL: Dynamically Feasible Deep Reinforcement Learning Policy for Robot Navigation among Mobile Obstacles

We present a novel Deep Reinforcement Learning (DRL) based policy to compute dynamically feasible and spatially aware velocities for a robot navigating among mobile obstacles. Our approach combines the benefit of the Dynamic Window Approach (DWA), namely satisfying the robot's dynamics constraints, with state-of-the-art DRL-based navigation methods that handle moving obstacles and pedestrians well. Our formulation achieves these goals by embedding the environmental obstacles' motions in a novel low-dimensional observation space. It also uses a novel reward function that positively reinforces velocities moving the robot away from an obstacle's heading direction, leading to a significantly lower number of collisions. We evaluate our method in realistic 3-D simulated environments and on a real differential drive robot in challenging dense indoor scenarios with several walking pedestrians. We compare our method with state-of-the-art collision avoidance methods and observe significant improvements in terms of success rate (up to 33% increase), number of dynamics constraint violations (up to 61% decrease), and smoothness. We also conduct ablation studies to highlight the advantages of our observation space formulation and reward structure.
