A Comprehensive Analysis of Accuracies of Machine Learning Algorithms for Network Intrusion Detection

Intrusion and anomaly detection are particularly important at a time of increased vulnerability in computer networks and communication. This research therefore aims to detect network intrusions with the highest accuracy in the shortest time. To achieve this, nine supervised machine learning algorithms were first applied to the UNSW-NB15 dataset for network anomaly detection. In addition, different attacks were investigated, together with mitigation techniques that help determine the type of attack. Once detection was done, the feature set was reduced, following existing research work, to increase the speed of the model without compromising accuracy. Seven supervised machine learning algorithms were also applied to the newly released BoT-IoT dataset, which contains around three million network flows. For the UNSW-NB15 dataset, the results show that Random Forest is the most accurate algorithm (97.9121%) and Naive Bayes is the fastest (0.69 s). For identifying the types of anomalies with all features considered, C4.5 is the most accurate (87.66%). For BoT-IoT, six of the seven algorithms achieve a detection rate close to 100%; the exception is Naive Bayes.
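The comparison described above ranks classifiers by accuracy and by run time. A minimal sketch of such a harness follows, under loud assumptions: the data is synthetic (it is not UNSW-NB15 or BoT-IoT), the two classifiers are small pure-Python stand-ins written for illustration, and all names (`make_synthetic_flows`, `evaluate`, `MajorityBaseline`, `GaussianNaiveBayes`) are hypothetical rather than taken from the study. It only shows how accuracy and elapsed time might be recorded side by side for each model.

```python
import math
import random
import time
from collections import Counter, defaultdict


class MajorityBaseline:
    """Always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label for _ in X]


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per (class, feature) pair."""

    def fit(self, X, y):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        self.stats = {}
        for label, rows in by_class.items():
            prior = len(rows) / len(X)
            params = []
            for column in zip(*rows):
                mean = sum(column) / len(column)
                var = sum((v - mean) ** 2 for v in column) / len(column)
                params.append((mean, var + 1e-9))  # small constant avoids zero variance
            self.stats[label] = (math.log(prior), params)
        return self

    def _log_posterior(self, row, label):
        log_prior, params = self.stats[label]
        total = log_prior
        for value, (mean, var) in zip(row, params):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total

    def predict(self, X):
        return [max(self.stats, key=lambda c: self._log_posterior(row, c)) for row in X]


def make_synthetic_flows(n, seed=0):
    """Assumption: 4 numeric features, about 30% 'attack' rows (label 1)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        centre = 3.0 if label else 0.0
        X.append([rng.gauss(centre, 1.0) for _ in range(4)])
        y.append(label)
    return X, y


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return (accuracy, seconds spent on training plus prediction)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    elapsed = time.perf_counter() - start
    correct = sum(p == t for p, t in zip(predictions, y_test))
    return correct / len(y_test), elapsed


X_train, y_train = make_synthetic_flows(2000, seed=1)
X_test, y_test = make_synthetic_flows(500, seed=2)

results = {}
for name, model in [("MajorityBaseline", MajorityBaseline()),
                    ("GaussianNaiveBayes", GaussianNaiveBayes())]:
    results[name] = evaluate(model, X_train, y_train, X_test, y_test)
    accuracy, seconds = results[name]
    print(f"{name}: accuracy={accuracy:.4f}, time={seconds:.4f}s")
```

In a real comparison, the synthetic generator would be replaced by the actual dataset loader and the stand-in models by the algorithms under study; timings depend on the machine and are not comparable to the figures reported above.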
