The Effective Methods for Intrusion Detection With Limited Network Attack Data: Multi-Task Learning and Oversampling

Many anomaly-based intrusion detection algorithms have recently been developed and applied in network security. These algorithms achieve high detection rates on many classical datasets. However, most of them fail to address two challenges: 1) imbalanced traffic data in which network attacks are rare, and 2) multiple data sources distributed across different terminals. Specifically, these algorithms assume that sufficient network traffic data are available to train their intrusion detection models. Because attack traffic is scarce in real-world networks, this assumption rarely holds. In this paper, we use Multi-Task Learning (MTL) and oversampling methods to address these challenges. First, we use MTL to treat each terminal as a single task and exploit the information shared between terminals to help learn each task. Second, we use oversampling to compensate for the scarcity of attack samples, which form the minority class. Through a series of experiments on the recent UNSW-NB15 and CICIDS2018 datasets, we verify the effectiveness of MTL and oversampling for network intrusion detection with limited attack data: the methods achieve a detection rate above 90% in different experimental settings.
