DWA-RL: Dynamically Feasible Deep Reinforcement Learning Policy for Robot Navigation among Mobile Obstacles

We present a novel Deep Reinforcement Learning (DRL) based policy to compute dynamically feasible and spatially aware velocities for a robot navigating among mobile obstacles. Our approach combines the benefit of the Dynamic Window Approach (DWA), namely satisfying the robot's dynamics constraints, with state-of-the-art DRL-based navigation methods that handle moving obstacles and pedestrians well. Our formulation achieves these goals by embedding the environmental obstacles' motions in a novel low-dimensional observation space. It also uses a novel reward function that positively reinforces velocities moving the robot away from an obstacle's heading direction, leading to a significantly lower number of collisions. We evaluate our method in realistic 3-D simulated environments and on a real differential drive robot in challenging dense indoor scenarios with several walking pedestrians. We compare our method with state-of-the-art collision avoidance methods and observe significant improvements in terms of success rate (up to 33% increase), number of dynamics constraint violations (up to 61% decrease), and smoothness. We also conduct ablation studies to highlight the advantages of our observation space formulation and reward structure.
