UAV Motion Strategies in Uncertain Dynamic Environments: A Path Planning Method Based on Q-Learning Strategy

This paper constructs a solution framework for UAV motion strategies in uncertain dynamic environments. Considering that the motion states of a UAV may be influenced by dynamic uncertainties such as control strategies, flight environments, and pop-up threats, we model the uncertain factors that affect UAV path planning as an unobservable part of the system and take the acceleration and bank angle of the UAV as the control variables. A cost function based on the tracking error is chosen, from which the control instructions and flight path of the UAV are derived. The cost function is then optimized through Q-learning, yielding the best UAV action sequence for conflict avoidance in an environment with moving threats. According to Bellman's principle of optimality, the optimal action strategy can be obtained from the current confidence level. The proposed method is closer to actual UAV path planning, since the planning strategy generated at each moment accounts for the influence of the UAV control strategy on its motion at the next moment. Simulation results show that all paths planned with the proposed framework achieve high tracking accuracy, and that the method requires a much shorter processing time while producing shorter paths than comparable approaches.
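To illustrate the core mechanism the abstract describes (optimizing a cost function through Q-learning and extracting an action sequence via Bellman's principle), the following is a minimal sketch only, not the paper's actual method: a tabular Q-learning agent on a small grid that learns a path to a goal while avoiding a single threat cell. The grid size, reward values, and the `plan_path` function are all illustrative assumptions; the paper's continuous state space, bank-angle control, and moving threats are abstracted away.

```python
import random

def plan_path(grid_size=5, start=(0, 0), goal=(4, 4), threat=(2, 2),
              episodes=3000, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Learn a Q-table on a toy grid, then return the greedy path
    from start to goal that avoids the threat cell. (Illustrative
    stand-in for Q-learning-based conflict-avoidance planning.)"""
    actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # E, W, S, N moves
    q = {}                                         # (state, action) -> value
    rng = random.Random(seed)

    def step(s, a):
        nx, ny = s[0] + a[0], s[1] + a[1]
        if not (0 <= nx < grid_size and 0 <= ny < grid_size):
            return s, -1.0, False          # bumped the boundary
        if (nx, ny) == threat:
            return (nx, ny), -10.0, True   # entered the threat zone
        if (nx, ny) == goal:
            return (nx, ny), 10.0, True    # reached the goal
        return (nx, ny), -0.1, False       # step cost favors shorter paths

    for _ in range(episodes):
        s, done, steps = start, False, 0
        while not done and steps < 100:
            steps += 1
            if rng.random() < eps:                 # epsilon-greedy exploration
                a = rng.randrange(4)
            else:
                a = max(range(4), key=lambda i: q.get((s, i), 0.0))
            s2, r, done = step(s, actions[a])
            # Bellman update: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))
            best_next = max(q.get((s2, i), 0.0) for i in range(4))
            qa = q.get((s, a), 0.0)
            q[(s, a)] = qa + alpha * (r + gamma * best_next - qa)
            s = s2

    # Greedy rollout over the learned Q-table gives the action sequence.
    path, s = [start], start
    for _ in range(4 * grid_size):
        a = max(range(4), key=lambda i: q.get((s, i), 0.0))
        s, _, done = step(s, actions[a])
        path.append(s)
        if done:
            break
    return path
```

In this toy setting the threat cell plays the role of the moving-threat environment, and the greedy rollout corresponds to choosing the optimal action at each state per Bellman's principle; the paper's framework additionally folds the unobservable uncertainties into the state via a confidence level.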
