Generating Labeled Flow Data from MAWILab Traces for Network Intrusion Detection

Identifying malicious activity on network connections is a growing challenge. The rapid growth of machine learning in recent years has led to its increasing use, particularly in the network intrusion detection research community. In applying these techniques, the community has recognized that datasets are pivotal for identifying malicious packets and connections, particularly datasets that carry the label information needed to construct learning models. However, publicly available datasets relevant to network intrusion detection research remain scarce. Thus, in this paper, we introduce a method to construct labeled flow data by combining packet meta-information with IDS logs to infer labels for intrusion detection research. Specifically, we designed a NetFlow-compatible format, because a large body of network devices, such as routers and switches, can export NetFlow records from raw traffic. The method thereby helps researchers access relevant network flow datasets together with label information.
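
The general idea of combining packet meta-information with IDS logs can be illustrated with a minimal sketch. It is not the implementation described in the paper: the `Packet`, `Alert`, and `Flow` record types, their field names, the wildcard matching rule, and the "unlabeled" default are all assumptions made for illustration. The sketch aggregates packets into flows keyed by the usual 5-tuple, with NetFlow-like packet and byte counters, and then copies the label of the first matching alert onto each flow.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

# 5-tuple flow key: (src_ip, dst_ip, src_port, dst_port, protocol)
FlowKey = Tuple[str, str, int, int, int]


@dataclass
class Packet:
    """Hypothetical per-packet meta-information (no payload)."""
    ts: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int
    length: int


@dataclass
class Alert:
    """Hypothetical IDS log entry; a field left as None acts as a wildcard."""
    label: str
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None


@dataclass
class Flow:
    """NetFlow-like record with an added label field (assumed layout)."""
    key: FlowKey
    first: float
    last: float
    packets: int = 0
    octets: int = 0
    label: str = "unlabeled"  # no matching alert does not prove benign traffic


def build_flows(packets: Iterable[Packet]) -> Dict[FlowKey, Flow]:
    """Aggregate packets that share a 5-tuple into one flow record."""
    flows: Dict[FlowKey, Flow] = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.proto)
        flow = flows.get(key)
        if flow is None:
            flow = flows[key] = Flow(key, p.ts, p.ts)
        flow.first = min(flow.first, p.ts)
        flow.last = max(flow.last, p.ts)
        flow.packets += 1
        flow.octets += p.length
    return flows


def matches(alert: Alert, key: FlowKey) -> bool:
    """True if every non-wildcard alert field equals the flow's field."""
    src_ip, dst_ip, src_port, dst_port, _proto = key
    pairs = [
        (alert.src_ip, src_ip),
        (alert.dst_ip, dst_ip),
        (alert.src_port, src_port),
        (alert.dst_port, dst_port),
    ]
    return all(want is None or want == got for want, got in pairs)


def label_flows(flows: Dict[FlowKey, Flow], alerts: List[Alert]) -> List[Flow]:
    """Assign each flow the label of the first alert that matches it."""
    for flow in flows.values():
        for alert in alerts:
            if matches(alert, flow.key):
                flow.label = alert.label
                break
    return list(flows.values())


if __name__ == "__main__":
    # Documentation-range addresses (RFC 5737); all values are made up.
    pkts = [
        Packet(0.0, "192.0.2.1", "198.51.100.7", 40000, 22, 6, 60),
        Packet(1.5, "192.0.2.1", "198.51.100.7", 40000, 22, 6, 120),
        Packet(2.0, "203.0.113.5", "198.51.100.7", 51000, 80, 6, 400),
    ]
    alerts = [Alert("anomalous", src_ip="192.0.2.1", dst_port=22)]
    for flow in label_flows(build_flows(pkts), alerts):
        print(flow)
```

A real pipeline would also need to restrict each alert to its time window and split long-lived flows with timeouts, both of which this sketch omits.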
