Network Abnormal Traffic Detection Model Based on Semi-Supervised Deep Reinforcement Learning

The rapid development of Internet technology has brought great convenience to production and daily life, but the accompanying security problems have become increasingly prominent. These problems threaten users’ privacy and pose significant security risks to many aspects of society, including politics, the economy, culture, and people’s livelihoods. Growth in information transmission rates expands the scope of attacks and gives intruders a more favorable environment in which to operate. Abnormal traffic detection is an effective security protection technology that can monitor network transmission in real time, sense external attacks, and provide response decisions for the relevant managers. Advances in machine learning have in turn driven advances in abnormal traffic detection, with the goal of using powerful, fast learning algorithms to cope with changing threats and respond in real time. Most current research on abnormal traffic detection is simulation-based and relies on well-known public datasets. On the one hand, these datasets contain massive, high-dimensional data that traditional machine learning methods cannot process. On the other hand, the amount of labeled data falls far short of application requirements, and because all labels are assigned manually, the labeling cost is exceptionally high. This paper proposes an optimization method for network abnormal traffic detection based on a semi-supervised Double Deep Q-Network (SSDDQN), which builds on the Double Deep Q-Network (DDQN), a representative deep reinforcement learning algorithm. In SSDDQN, the current network first uses an autoencoder to reconstruct the traffic features and then uses a deep neural network as a classifier. The target network first applies the unsupervised K-Means clustering algorithm and then uses a deep neural network for prediction.
The experiments use the NSL-KDD and AWID datasets for training and testing and include a comprehensive comparison with existing machine learning models. The experimental results show that SSDDQN has certain advantages in time complexity and achieves good results on various evaluation metrics.
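
For readers unfamiliar with DDQN, the interaction between the current network and the target network can be illustrated with the standard Double DQN target: the current network selects the best next action, and the target network evaluates it. The sketch below shows only that generic rule in plain Python. It does not reproduce the autoencoder, the K-Means step, or any other SSDDQN-specific component, and the function name `double_dqn_targets`, its parameters, and the default discount factor are assumptions introduced for illustration.

```python
def double_dqn_targets(rewards, q_online_next, q_target_next, dones, gamma=0.99):
    """Compute standard Double DQN targets for a batch of transitions.

    rewards        -- list of immediate rewards, one per transition
    q_online_next  -- Q-values of the current (online) network for each next state
    q_target_next  -- Q-values of the target network for each next state
    dones          -- list of booleans, True if the episode ended at that step
    gamma          -- discount factor (illustrative default, not from the paper)
    """
    targets = []
    for r, q_on, q_tg, done in zip(rewards, q_online_next, q_target_next, dones):
        # The current network chooses the action for the next state ...
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # ... and the target network supplies the value of that action.
        bootstrap = 0.0 if done else gamma * q_tg[best_action]
        targets.append(r + bootstrap)
    return targets


# Hypothetical two-transition batch with two actions (e.g. "normal" / "abnormal").
example = double_dqn_targets(
    rewards=[1.0, 0.0],
    q_online_next=[[0.2, 0.8], [0.9, 0.1]],
    q_target_next=[[0.5, 0.3], [0.4, 0.7]],
    dones=[False, True],
    gamma=0.5,
)
print(example)
```

Separating action selection from action evaluation in this way is what distinguishes DDQN from plain DQN, where a single network both selects and evaluates the next action and tends to overestimate Q-values.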