Combining Unsupervised Approaches for Near Real-Time Network Traffic Anomaly Detection

A 0-day attack is a cyber-attack that exploits vulnerabilities that have not yet been published. Detecting the anomalous traffic generated by such attacks is vital, because these attacks can cause critical technical and economic damage to a smart enterprise, as to any system that depends heavily on technology. One way to detect this kind of attack is to use unsupervised machine learning approaches, since they can detect anomalies without prior knowledge of them. It is also essential to identify anomalous and unknown behaviors within a network in near real-time. Three approaches are proposed and benchmarked under exactly the same conditions: Deep Autoencoding with GMM and Isolation Forest, Deep Autoencoder with Isolation Forest, and Memory Augmented Deep Autoencoder with Isolation Forest. Each approach is thus the result of combining different unsupervised algorithms. The results show that adding the Isolation Forest improves accuracy and increases inference time, although the increase is not large enough to be a relevant problem. Using the explainable artificial intelligence methodology called Shapley Additive Explanations (SHAP), this paper also explains which features the various models consider most important for classifying an event as an attack. Experiments were conducted on the KDD99, NSL-KDD, and CIC-IDS2017 datasets.
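To make the Isolation Forest stage concrete, the following is a minimal pure-Python sketch of isolation-based anomaly scoring. It is not the implementation benchmarked here. The function names (`fit_forest`, `anomaly_score`, and the helpers) and the toy two-dimensional data are illustrative assumptions. The abstract does not specify which features are passed to the Isolation Forest, so the rows below simply stand in for whatever per-event features an autoencoder stage might produce (for example, a latent code and a reconstruction error).

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # nothing left to split on this feature
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node, depth=0):
    """Depth at which row reaches a leaf, plus a correction for unsplit points."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if row[node["feature"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)


def fit_forest(rows, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random sub-sample (needs >= 2 rows)."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(row, forest):
    """Score in (0, 1]; values closer to 1 mean the row is isolated more quickly."""
    trees, sample_size = forest
    mean_depth = sum(path_length(row, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))


# Toy data: 200 "normal" events clustered around the origin (hypothetical features).
data_rng = random.Random(42)
normal_rows = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]

forest = fit_forest(normal_rows)
center_score = anomaly_score((0.0, 0.0), forest)   # typical event
outlier_score = anomaly_score((8.0, 8.0), forest)  # event far from the normal cluster
print(f"center={center_score:.3f} outlier={outlier_score:.3f}")
```

In this sketch the forest is fit on normal-looking rows only, and an event is flagged when its score exceeds a chosen threshold. Both the threshold and the choice of input features are design decisions that this example leaves open.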
