Generative adversarial networks (GANs) have produced striking results in a variety of domains. Their generative capability is general, and in learning to generate, the networks gain a deep understanding of the underlying data distribution. In many domains, this distribution consists of normal data and anomalies, with the anomalies occurring relatively rarely, which makes the resulting datasets imbalanced. The capabilities of GANs can be leveraged to examine these anomalies and to alleviate the challenge that imbalanced datasets pose by creating synthetic anomalies. Such anomaly generation is particularly beneficial in domains that have both costly data creation processes and inherently imbalanced datasets. One domain that fits this description is host-based intrusion detection. In this work, the ADFA-LD dataset, which contains system calls of small-footprint, next-generation attacks, is chosen as the dataset of interest. The data is first converted into images, and a Cycle-GAN is then used to create images of anomalous data from images of normal data. The generated data is combined with the original dataset and used to train a model to detect anomalies. Doing so improves the classification results: the AUC rises from 0.55 to 0.71, and the anomaly detection rate rises from 17.07% to 80.49%. The results are also compared with those obtained using SMOTE, showing the potential of generative adversarial networks for anomaly generation.
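The abstract does not specify how system-call traces are turned into images, so the following is only a minimal sketch of one possible encoding, not the authors' method. It assumes each trace is a sequence of integer system-call IDs, truncates or zero-pads it to a square grid, and scales IDs linearly to 8-bit grayscale intensities. The function name `trace_to_image`, the grid size `side`, and the upper bound `max_call_id` are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def trace_to_image(
    trace: Sequence[int],
    side: int = 8,
    max_call_id: int = 340,  # assumed upper bound on call IDs, not taken from the paper
) -> List[List[int]]:
    """Map a system-call trace to a side x side grayscale grid (values 0-255)."""
    n_pixels = side * side
    # Truncate long traces and zero-pad short ones (0 is treated as padding here).
    padded = list(trace[:n_pixels]) + [0] * max(0, n_pixels - len(trace))
    # Scale each call ID linearly into the 0-255 range, clamping out-of-range IDs.
    pixels = [min(255, max(0, round(255 * c / max_call_id))) for c in padded]
    # Reshape the flat pixel list into rows.
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Made-up call IDs purely for illustration; not taken from ADFA-LD.
example_trace = [6, 6, 63, 6, 42, 120, 6, 195, 120, 6]
image = trace_to_image(example_trace, side=4)
for row in image:
    print(row)
```

Images produced this way could then serve as the two domains (normal and anomalous) for an unpaired image-to-image translation model such as a Cycle-GAN. Other encodings, for example n-gram frequency maps, are equally plausible.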