Features Dimensionality Reduction Approaches for Machine Learning Based Network Intrusion Detection

The security of networked systems has become a critical, universal issue that affects individuals, enterprises, and governments. The rate of attacks against networked systems has increased dramatically, and the tactics used by attackers continue to evolve. Intrusion detection is one of the defenses against these attacks, and Machine Learning is a common and effective approach to designing Intrusion Detection Systems (IDS). The performance of an IDS improves significantly when its features are more discriminative and representative. This study uses two feature dimensionality reduction approaches: (i) Auto-Encoder (AE), a deep learning technique, and (ii) Principal Component Analysis (PCA). The low-dimensional features produced by both techniques are then used to build several classifiers, namely Random Forest (RF), Bayesian Network, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA), for designing an IDS. The experimental findings with low-dimensional features in binary and multi-class classification show better performance in terms of Detection Rate (DR), F-Measure, False Alarm Rate (FAR), and Accuracy. This work reduces the CICIDS2017 dataset’s feature dimensions from 81 to 10 while maintaining a high accuracy of 99.6% in both multi-class and binary classification. Furthermore, we propose a Multi-Class Combined performance metric, CombinedMc, that takes class distribution into account and allows various multi-class and binary classification systems to be compared by incorporating FAR, DR, Accuracy, and class distribution parameters. In addition, we developed a uniform-distribution-based balancing approach to handle the imbalanced distribution of minority class instances in the CICIDS2017 network intrusion dataset.
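The PCA reduction step and the basic binary metrics named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pca_reduce` and `binary_metrics` are hypothetical, the input is random synthetic data standing in for 81 flow features, and DR, FAR, and Accuracy use their standard confusion-matrix definitions, which may differ in detail from the paper's exact formulation. The CombinedMc metric is not reproduced, since its formula is not given here.

```python
import numpy as np


def pca_reduce(X, n_components=10):
    """Project the rows of X onto the top principal components (PCA via SVD)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean  # PCA requires mean-centered data
    # Rows of Vt are principal directions, sorted by decreasing singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    # Fraction of total variance retained by the kept components.
    explained = float((s[:n_components] ** 2).sum() / (s ** 2).sum())
    return Xc @ components.T, components, mean, explained


def binary_metrics(y_true, y_pred):
    """Standard DR, FAR, and Accuracy, assuming label 1 = attack, 0 = benign."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    return {
        "DR": tp / (tp + fn),    # detection rate (recall on attacks)
        "FAR": fp / (fp + tn),   # false alarm rate
        "Accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Synthetic stand-in for a dataset with 81 features (an assumption, not CICIDS2017).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 81))
Z, components, mean, explained = pca_reduce(X, n_components=10)
print(Z.shape)  # (500, 10)
```

The reduced matrix `Z` would then be passed to a classifier such as RF, LDA, or QDA; a new sample `x` is projected with `(x - mean) @ components.T` so that training and test data share the same projection.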
