Paper Information - DeepADMR: A Deep Learning based Anomaly Detection for MANET Routing

We developed DeepADMR, a novel neural anomaly detector for DeepCQ+, a deep reinforcement learning (DRL)-based MANET routing policy. The performance of DRL-based algorithms such as DeepCQ+ is verified only within the environments in which they were trained and tested, so deploying them in the tactical domain carries high risk. DeepADMR monitors the DeepCQ+ policy for unexpected behavior in real time using temporal-difference errors (TD-errors), and detects anomalous scenarios with empirical, non-parametric cumulative-sum statistics. The DeepCQ+ design, which uses multi-agent weight-sharing proximal policy optimization (PPO), is slightly modified to enable real-time estimation of the TD-errors. We report DeepADMR's performance under channel disruptions, high mobility levels, and network sizes beyond those of the training environments; the results show that it is effective in these conditions.
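The detection idea in the abstract, TD-errors fed into an empirical, non-parametric cumulative-sum (CUSUM) statistic, can be illustrated with a minimal sketch. This is not the paper's actual statistic: the abstract does not give its form, so the version below assumes one common non-parametric variant, in which each TD-error magnitude is scored by its empirical tail probability against a reference set of nominal TD-errors. The names `td_error`, `EmpiricalCusum`, and `first_alarm`, and the parameters `gamma`, `alpha`, and `threshold`, are all hypothetical.

```python
import bisect
import math


def td_error(reward, value, next_value, gamma=0.99):
    """One-step temporal-difference error: r + gamma * V(s') - V(s)."""
    return reward + gamma * next_value - value


class EmpiricalCusum:
    """Assumed non-parametric CUSUM over TD-error magnitudes.

    Illustrative only; the paper's exact statistic is not stated in the abstract.
    """

    def __init__(self, nominal_errors, alpha=0.2, threshold=5.0):
        # Sorted magnitudes of TD-errors collected under nominal conditions.
        self.ref = sorted(abs(e) for e in nominal_errors)
        self.alpha = alpha          # tail level below which evidence accumulates
        self.threshold = threshold  # alarm level for the cumulative statistic
        self.stat = 0.0

    def reset(self):
        self.stat = 0.0

    def tail_prob(self, error):
        """Smoothed fraction of nominal magnitudes that are >= |error|."""
        idx = bisect.bisect_left(self.ref, abs(error))
        n = len(self.ref)
        return (n - idx + 1) / (n + 1)

    def update(self, error):
        """Fold one TD-error into the statistic; return True on alarm."""
        p = self.tail_prob(error)
        # The increment is positive only when the sample is rarer than alpha.
        self.stat = max(0.0, self.stat + math.log(self.alpha / p))
        return self.stat >= self.threshold


def first_alarm(detector, errors):
    """Return the index of the first alarm in a stream, or None."""
    detector.reset()
    for i, e in enumerate(errors):
        if detector.update(e):
            return i
    return None


# Hypothetical nominal TD-errors spread evenly over [-1, 1].
nominal = [i / 100 for i in range(-100, 101)]
detector = EmpiricalCusum(nominal)
# 20 in-distribution errors followed by 5 large, out-of-distribution ones.
stream = [0.1] * 20 + [10.0] * 5
alarm_index = first_alarm(detector, stream)
```

Under these assumed settings a single extreme sample stays just below the threshold, so the alarm fires only once consecutive anomalous TD-errors accumulate. That is the usual motivation for a cumulative statistic over a per-sample test: isolated outliers are tolerated, while a sustained shift is flagged quickly.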

Alex Yahja | Saeed Kaviani | Bo Ryu | Jae H. Kim | Kevin Larson
