Robot Path Planning in Dynamic Environments Based on Deep Reinforcement Learning

Path planning in dynamic environments has long been an active research direction. This paper considers a new type of dynamic environment: obstacles are randomly distributed, and all obstacles are redistributed at random after each of the robot's movements. Traditional path planning algorithms perform poorly in such environments because they must recompute the path whenever the environment changes, which is a time-consuming process. A deep reinforcement learning (DRL) model, by contrast, is a single-step algorithm: changes in the environment do not increase its running time, so it is superior to traditional path planning algorithms in this respect. However, because the state space of the environment is large, a DRL model faces the problem of sparse rewards in path planning. This paper applies DRL to overcome the shortcomings of traditional path planning algorithms in dynamic environments and proposes a new framework to address the sparse-reward problem in robot path planning. The framework combines a new policy search algorithm with a new shaped reward function, and it effectively solves the convergence problem in path planning. Simulation results show that, in stochastic dynamic environments, the running time of the new framework is lower than that of traditional path planning algorithms, and that the new framework outperforms the classic DRL model in both training and planning results.
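The paper's specific shaped reward function is not reproduced in the abstract. A common way to densify sparse terminal rewards in grid path planning, which the abstract's approach resembles, is potential-based reward shaping; the following minimal sketch is an illustrative assumption (function and parameter names are hypothetical), not the authors' exact formulation:

```python
import numpy as np

def shaped_reward(pos, next_pos, goal, collided,
                  step_cost=-0.01, gamma=0.99):
    """Potential-based shaping for sparse path-planning rewards.

    Sparse terminal rewards (+1 at the goal, -1 on collision) are
    augmented with the shaping term gamma * phi(s') - phi(s), where
    phi is the negative Euclidean distance to the goal. Potential-based
    shaping is known to preserve the optimal policy.
    """
    if collided:
        return -1.0
    if tuple(next_pos) == tuple(goal):
        return 1.0

    def phi(p):
        # Potential: closer to the goal -> higher (less negative) value.
        return -float(np.linalg.norm(np.subtract(p, goal)))

    return step_cost + gamma * phi(next_pos) - phi(pos)
```

With this shaping, a step toward the goal earns a strictly higher reward than a step away from it, so the agent receives a learning signal at every step instead of only at the episode's end.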
