The Significant Features of the UNSW-NB15 and the KDD99 Data Sets for Network Intrusion Detection Systems

Because of the increased flow of network traffic and its significance to the provision of ubiquitous services, cyberattacks attempt to compromise the security principles of confidentiality, integrity and availability. A Network Intrusion Detection System (NIDS) monitors and detects cyberattack patterns in networking environments. Network packets carry a wide variety of features, which negatively affects the detection of anomalies: some of these features are irrelevant or redundant, which reduces the efficiency of detecting attacks and increases the False Alarm Rate (FAR). In this paper, the feature characteristics of the UNSW-NB15 and KDD99 data sets are examined, and the features of the UNSW-NB15 are replicated to the KDD99 data set to measure their efficiency. We apply an Association Rule Mining algorithm as a feature selection technique to generate the strongest features from the two data sets. Some existing classifiers are utilised to evaluate the complexity in terms of accuracy and FAR. The experimental results show that the original KDD99 attributes are less efficient than the replicated UNSW-NB15 attributes of the KDD99 data set. However, comparing the two data sets, the accuracy on the KDD99 data set is better than on the UNSW-NB15 data set, and the FAR of the KDD99 data set is lower than that of the UNSW-NB15 data set.
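
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the paper's actual algorithm. It assumes single-antecedent rules of the form (feature = value) -> label, ranks features by the confidence of their best rule, and computes FAR as FP / (FP + TN), which is one common definition and may differ from the one used in the paper. The function names, thresholds and toy records are all hypothetical.

```python
from collections import Counter


def rule_stats(records, labels):
    """Support and confidence of every rule (feature = value) -> label."""
    n = len(records)
    antecedent = Counter()  # how often each (feature, value) pair occurs
    joint = Counter()       # how often it occurs together with a label
    for rec, lab in zip(records, labels):
        for feat, val in rec.items():
            antecedent[(feat, val)] += 1
            joint[(feat, val, lab)] += 1
    return {
        key: (cnt / n, cnt / antecedent[key[:2]])
        for key, cnt in joint.items()
    }


def rank_features(records, labels, min_support=0.2, min_confidence=0.8):
    """Keep features that have at least one strong rule, best first."""
    best = {}
    for (feat, _val, _lab), (sup, conf) in rule_stats(records, labels).items():
        if sup >= min_support and conf >= min_confidence:
            best[feat] = max(best.get(feat, 0.0), conf)
    # Sort by descending confidence, then by name for a stable order.
    return sorted(best, key=lambda f: (-best[f], f))


def accuracy_and_far(y_true, y_pred, attack="attack"):
    """Accuracy and FAR, with FAR assumed to be FP / (FP + TN)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == attack:
            if t == attack:
                tp += 1
            else:
                fp += 1
        else:
            if t == attack:
                fn += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, far


# Hypothetical toy records; real data sets have dozens of features.
records = [
    {"proto": "tcp", "service": "http"},
    {"proto": "tcp", "service": "http"},
    {"proto": "udp", "service": "dns"},
    {"proto": "udp", "service": "http"},
]
labels = ["normal", "normal", "attack", "attack"]

selected = rank_features(records, labels, min_support=0.5, min_confidence=0.8)
```

On this toy data only `proto` survives the thresholds, because every rule involving `service` has either too little support or too little confidence. A full Association Rule Mining algorithm would also consider multi-item antecedents, which this sketch deliberately omits.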
