Dynamically Modelling Heterogeneous Higher-Order Interactions for Malicious Behavior Detection in Event Logs

Anomaly detection in event logs is a promising approach for intrusion detection in enterprise networks. By building a statistical model of normal activity, it aims to detect many kinds of malicious behavior, including stealthy tactics, techniques and procedures (TTPs) designed to evade signature-based detection systems. However, finding suitable anomaly detection methods for event logs remains an open challenge. The difficulty stems from the complex, multi-faceted nature of the data: event logs are at once combinatorial, temporal and heterogeneous, so they fit poorly into most theoretical frameworks for anomaly detection. Most previous research focuses on only one of these three aspects, building a simplified representation of the data that can be fed to standard anomaly detection algorithms. In contrast, we propose to address all three characteristics simultaneously through a specifically tailored statistical model. We introduce Decades, a dynamic, heterogeneous and combinatorial model for anomaly detection in event streams, and we demonstrate its effectiveness at detecting malicious behavior through experiments on a real dataset containing labelled red team activity. In particular, we empirically highlight the importance of handling all of these characteristics by comparing our model with state-of-the-art baselines that rely on various data representations.
