A Path Planning Algorithm for UAV Based on Improved Q-Learning

In this paper, a new learning algorithm based on improved Q-learning is proposed for path planning of an Unmanned Aerial Vehicle (UAV) in an unknown antagonistic environment. A reward function is designed according to the optimization objective of the UAV's task, and a new action selection strategy and a Q-function initialization method are introduced to improve the performance of the proposed algorithm. The STAGE Scenario simulation software serves as the training and validation environment, and a plug-in is designed to link the environment with the learning algorithms. The experimental results show that the improved method is more effective than the original one, and that the proposed algorithm is feasible and effective for UAV path planning.
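To make the abstract's three ingredients concrete, the following is a minimal tabular Q-learning sketch for grid-based path planning. It is an illustrative assumption, not the paper's actual design: the grid size, the per-step reward shaping, the epsilon-greedy action selection, and the goal-distance heuristic used to initialize the Q-function are all placeholders standing in for the paper's reward function, action selection strategy, and Q-initialization method.

```python
import random

# Hyperparameters (illustrative values, not from the paper)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2
SIZE = 5                      # 5x5 grid world as a stand-in environment
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

def step(s, a):
    """Apply action a in state s, clamping to the grid bounds."""
    x, y = s[0] + a[0], s[1] + a[1]
    return (min(max(x, 0), SIZE - 1), min(max(y, 0), SIZE - 1))

def manhattan(s, g=GOAL):
    return abs(s[0] - g[0]) + abs(s[1] - g[1])

# Heuristic Q-initialization: bias each state-action value by the distance
# of the successor state to the goal (one common way to speed convergence,
# analogous in spirit to the paper's Q-function initialization).
Q = {}
def q(s, a):
    return Q.setdefault((s, a), float(-manhattan(step(s, a))))

def reward(s):
    """Large terminal reward at the goal, small per-step cost otherwise."""
    return 100.0 if s == GOAL else -1.0

def choose(s):
    """Epsilon-greedy action selection (a simple stand-in strategy)."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q(s, a))

def train(episodes=300, max_steps=100):
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(max_steps):
            a = choose(s)
            s2 = step(s, a)
            best_next = max(q(s2, b) for b in ACTIONS)
            # Standard Q-learning update
            Q[(s, a)] = q(s, a) + ALPHA * (reward(s2) + GAMMA * best_next - q(s, a))
            s = s2
            if s == GOAL:
                break

def greedy_path(start=(0, 0), limit=4 * SIZE):
    """Follow the greedy policy from start; stop at the goal or the limit."""
    s, path = start, [start]
    while s != GOAL and len(path) < limit:
        s = step(s, max(ACTIONS, key=lambda a: q(s, a)))
        path.append(s)
    return path
```

After training, `greedy_path()` extracts a goal-directed route; in this obstacle-free toy grid the distance-biased initialization already points the greedy policy toward the goal, which is the intended effect of initializing the Q-function instead of starting from zeros.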
