Research on Node Deployment in Different Terrain of MANET Based on Relational Deep Reinforcement Learning

The deployment of mobile ad hoc network (MANET) communication nodes resembles a game of Go between black and white stones. Using the deep reinforcement learning techniques of the AlphaZero algorithm, the node deployment process can be abstracted as a board game, treating deployment as a direct game between communication nodes and users. The key difficulty, and the key difference from Go, is that a game board never changes, whereas the deployment grid of a MANET does: the geographical environment varies across application scenarios and affects where communication nodes can be placed. This paper studies a deep reinforcement learning framework based on the AlphaZero algorithm and proposes that the relationship between the geographical environment and communication node deployment can be inferred implicitly through relational reasoning, thereby solving the node deployment problem across different terrain scenarios.
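To make the abstraction concrete, the sketch below shows one way such a system could be structured: the deployment area is encoded as a grid with a terrain channel and a placed-node channel, and a policy-value network with a multi-head self-attention block lets every grid cell attend to every other, so terrain features can implicitly inform placement priors, combining an AlphaZero-style policy-value head with relational reasoning via attention. This is a minimal illustrative sketch under stated assumptions, not the paper's implementation; the class name, channel encoding, and all hyperparameters (RelationalPolicyValueNet, embed_dim, the two-channel board) are hypothetical.

```python
# Minimal sketch: relational policy-value network for grid node deployment.
# Assumptions (not from the paper): the board is (batch, 2, H, W) with one
# terrain channel and one channel marking already-placed nodes; relational
# reasoning is realized as multi-head self-attention over grid cells.
import torch
import torch.nn as nn

class RelationalPolicyValueNet(nn.Module):
    def __init__(self, in_channels=2, embed_dim=64, heads=4):
        super().__init__()
        self.encode = nn.Conv2d(in_channels, embed_dim, kernel_size=3, padding=1)
        # Self-attention lets each cell relate to every other cell, so the
        # network can implicitly couple terrain features to candidate sites.
        self.attn = nn.MultiheadAttention(embed_dim, heads, batch_first=True)
        self.policy_head = nn.Linear(embed_dim, 1)   # one placement logit per cell
        self.value_head = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, 1), nn.Tanh())

    def forward(self, board):                      # board: (B, 2, H, W)
        x = torch.relu(self.encode(board))         # (B, E, H, W)
        tokens = x.flatten(2).transpose(1, 2)      # (B, H*W, E): one token per cell
        rel, _ = self.attn(tokens, tokens, tokens)
        policy_logits = self.policy_head(rel).squeeze(-1)      # (B, H*W)
        value = self.value_head(rel.mean(dim=1)).squeeze(-1)   # scalar per board
        return policy_logits, value

# Usage: terrain map plus current placements in, per-cell priors and a value
# estimate out, to be consumed by an AlphaZero-style MCTS self-play loop.
net = RelationalPolicyValueNet()
board = torch.zeros(1, 2, 16, 16)
board[0, 0] = torch.rand(16, 16)   # hypothetical terrain elevation map
logits, v = net(board)
print(logits.shape, v.shape)       # torch.Size([1, 256]) torch.Size([1])
```

In an AlphaZero-style loop, the per-cell logits would serve as search priors over candidate deployment sites and the scalar value as an estimate of eventual coverage quality, with self-play replaced by repeated deployment episodes on varied terrain maps so the relational block learns terrain-dependent placement rather than a fixed board.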
