A New Smart Router-Throttling Method to Mitigate DDoS Attacks

The distributed denial of service (DDoS) attack is one of the most severe threats to the current Internet and causes huge losses to society. Moreover, DDoS attacks are challenging to defend against because DDoS traffic can appear similar to legitimate traffic. Router throttling is an accessible approach to defending against DDoS attacks. Some existing router throttling methods dynamically adjust a given threshold value to keep the server load safe. However, these methods are not ideal because they exploit only information from the current time step, so they perceive time-series variations poorly. The DDoS problem can be seen as a Markov decision process (MDP). A multi-agent router throttling (MART) method based on a hierarchical communication mechanism has been proposed to address this problem. However, its agents are independent of one another and do not communicate, so it is hard for them to collaborate in learning an ideal policy to defend against DDoS. To solve this multi-agent partially observable MDP problem, we propose a centralized reinforcement learning router throttling method based on a centralized communication mechanism. Each router sends its own traffic reading to a central router, which then chooses a throttling rate for each router. We also simulate the DDoS environment more realistically and modify the reward function of MART to make it more coherent. To decrease communication costs, we add a deep deterministic policy gradient network to each router to decide whether or not to send information to the central agent. Experiments validate that the proposed smart router throttling method outperforms existing methods in DDoS intrusion response.
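
The centralized communication loop described above can be illustrated with a minimal sketch. It is not the method itself: the learned central policy is replaced by a hypothetical proportional rule, and the per-router deep deterministic policy gradient gate is replaced by a simple change threshold. All names (`Router`, `central_policy`, `step`, `capacity`, `tolerance`) are assumptions made for illustration and do not come from the original work.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Router:
    """A throttling router; `reading` is its current aggregate traffic."""
    name: str
    reading: float
    last_sent: float = 0.0  # last value reported to the central router

    def should_send(self, tolerance: float) -> bool:
        # Stand-in for the learned send/skip decision (assumption):
        # report only when the reading moved by more than `tolerance`.
        return abs(self.reading - self.last_sent) > tolerance


def central_policy(known: List[float], capacity: float) -> List[float]:
    """Placeholder for the learned central policy (assumption).

    Returns, per router, the fraction of traffic to drop so that the
    reported total fits within `capacity`.
    """
    total = sum(known)
    if total <= capacity:
        return [0.0] * len(known)
    drop = 1.0 - capacity / total
    return [drop] * len(known)


def step(routers: List[Router], capacity: float,
         tolerance: float) -> Tuple[List[float], float, int]:
    """Run one control step: report, decide, throttle."""
    messages = 0
    for r in routers:
        if r.should_send(tolerance):
            r.last_sent = r.reading
            messages += 1
    # The central router acts on the most recent reading it has per router.
    rates = central_policy([r.last_sent for r in routers], capacity)
    server_load = sum(r.reading * (1.0 - rate)
                      for r, rate in zip(routers, rates))
    return rates, server_load, messages


routers = [Router("r0", 6.0), Router("r1", 10.0), Router("r2", 4.0)]
rates, load, sent = step(routers, capacity=10.0, tolerance=0.5)
print(rates, load, sent)
```

In this sketch a router that skips a report leaves the central router with a stale reading, which shows the trade-off between communication cost and decision quality that the gating network is meant to learn.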
