Paper information - Dynamic value iteration networks for the planning of rapidly changing UAV swarms

In an unmanned aerial vehicle ad-hoc network (UANET), sparsely distributed and fast-moving unmanned aerial vehicles (UAVs), which act as network nodes, can change the UANET topology dynamically, and this can degrade UANET service performance. In this study, we propose a dynamic value iteration network (DVIN) model for planning rapidly changing UAV swarms. The model is trained with the episodic Q-learning method on the connection information of UANETs to generate a state value spread function, which enables UAVs/nodes to adapt to novel physical locations. We evaluate the DVIN model and compare it with the non-dominated sorting genetic algorithm II (NSGA-II) and the exhaustive method. Simulation results show that the proposed model significantly reduces the decision-making time for UAV/node path planning while maintaining a high average success rate.
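To give a rough intuition for how state values can spread over connection information, the sketch below runs plain tabular value iteration on a small, hypothetical link graph. It is not the DVIN model and not the authors' training procedure. The graph `links`, the reward values, the discount factor, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only: tabular value iteration on a hypothetical graph.
# This is NOT the DVIN model from the paper; topology, rewards, and the
# discount factor below are assumptions chosen to keep the example small.

GAMMA = 0.9          # assumed discount factor
STEP_REWARD = -1.0   # assumed cost of each hop
GOAL_REWARD = 10.0   # assumed reward for reaching the goal node


def action_value(values, next_node, goal):
    """One-step lookahead value of moving to next_node."""
    reward = GOAL_REWARD if next_node == goal else STEP_REWARD
    return reward + GAMMA * values[next_node]


def value_iteration(links, goal, sweeps=50):
    """Repeat Bellman backups so values spread outward from the goal."""
    values = {node: 0.0 for node in links}
    for _ in range(sweeps):
        updated = {}
        for node, adjacent in links.items():
            if node == goal:
                updated[node] = 0.0  # the goal is treated as terminal
            else:
                updated[node] = max(
                    action_value(values, nxt, goal) for nxt in adjacent
                )
        values = updated
    return values


def greedy_path(links, values, start, goal):
    """Follow the highest-valued neighbor from start until the goal."""
    path = [start]
    while path[-1] != goal and len(path) <= len(links):
        current = path[-1]
        path.append(
            max(links[current], key=lambda n: action_value(values, n, goal))
        )
    return path


# Hypothetical connection information: node -> reachable neighbors.
links = {
    0: [1],
    1: [0, 2, 3],
    2: [1, 3],
    3: [1, 2, 4],
    4: [3],
}

values = value_iteration(links, goal=4)
path = greedy_path(links, values, start=0, goal=4)
print(values)
print(path)
```

In this toy setting, nodes closer to the goal end up with higher values, and the greedy path takes the shortcut link between nodes 1 and 3. A value iteration network replaces the hand-written backup with learned layers, which this sketch does not attempt to reproduce.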
