Data Augmentation for Insider Threat Detection with GAN

In the insider threat detection domain, datasets are highly imbalanced: records of normal user behavior far outnumber records of anomalous insider behavior. A direct way to handle this class imbalance is to apply data augmentation to the minority class. Existing data augmentation methods mainly produce synthetic samples through linear operations on existing minority-class samples. As a result, they capture only local information, which makes the synthetic samples too uniform and leads to overfitting. To enrich the diversity of the synthetic samples, we propose a deep adversarial insider threat detection (DAITD) framework that uses a Generative Adversarial Network (GAN) to approximate the true distribution of anomalous behavior. Specifically, we first obtain anomalous user behavior representations from the anomalous behavior data (the minority class). We then use the generator of the GAN to model the actual anomalous behavior distribution, and the discriminator of the GAN to distinguish real samples from the synthetic samples produced by the generator. In this way, our method is able to generate high-quality synthetic samples that are close to real anomalous user behavior. Experimental results show that the DAITD framework outperforms the other insider threat detection algorithms used for comparison.
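
To make the limitation of linear augmentation concrete, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the DAITD framework and not a reference implementation of any published method; the function name `interpolate_minority`, its parameters, and the toy data are assumptions made for illustration only. Each synthetic point is a convex combination of a minority sample and one of its nearest minority neighbours, so it always lies on a line segment between two existing samples.

```python
import random

def interpolate_minority(samples, n_new, k=2, seed=0):
    """Hypothetical SMOTE-style oversampler (illustrative only).

    Each synthetic point lies on the segment between a minority sample
    and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other minority samples by squared Euclidean distance to `base`.
        others = [s for s in samples if s is not base]
        neighbours = sorted(
            others,
            key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)),
        )[:k]
        neighbour = rng.choice(neighbours)
        lam = rng.random()  # interpolation weight in [0, 1)
        # Linear operation: base + lam * (neighbour - base)
        synthetic.append([a + lam * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class: the four corners of the unit square (assumed data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = interpolate_minority(minority, n_new=5)
```

With this toy data, every generated point falls on an edge of the square and never in its interior. That illustrates the local, low-diversity behaviour described above, which a generator trained to model the whole minority distribution is intended to avoid.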
