ADA: Adaptive Deep Log Anomaly Detector

Large private and government networks are often subjected to attacks such as data extrusion and service disruption. Existing anomaly detection systems rely on offline supervised learning and on experts for labeling, so they cannot detect anomalies in real time. Unsupervised algorithms are increasingly used, but they cannot readily adapt to newer threats. Moreover, many such systems incur high storage costs and require extensive computational resources. In this paper, we propose ADA: Adaptive Deep Log Anomaly Detector, an unsupervised online deep neural network framework that leverages LSTM networks and regularly adapts to newer log patterns to ensure accurate anomaly detection. In ADA, an adaptive model selection strategy chooses Pareto-optimal configurations and thereby uses resources efficiently. Further, a dynamic threshold algorithm sets the detection threshold based on recently detected events to improve detection accuracy. We also use the predictions to guide the storage of abnormal data, which reduces the overall storage cost. We compare ADA with state-of-the-art approaches on the Los Alamos National Laboratory cyber security dataset and show that ADA detects anomalies accurately, with an F1-score of about 95%, is 97 times faster than existing approaches, and incurs very low storage cost.
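
To make the notion of a Pareto-optimal configuration concrete, the following is a minimal sketch and not ADA's actual selection strategy. It assumes each candidate model is scored on two objectives, an F1-score to maximize and a resource cost to minimize. The `Config` type, the configuration names, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str    # hypothetical label for a model configuration
    f1: float    # detection quality, higher is better
    cost: float  # resource cost (e.g. seconds per batch), lower is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.f1 >= b.f1 and a.cost <= b.cost
    strictly_better = a.f1 > b.f1 or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(configs: list[Config]) -> list[Config]:
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Illustrative values only; they are not results from the paper.
candidates = [
    Config("lstm-1x32", f1=0.90, cost=1.0),
    Config("lstm-2x64", f1=0.95, cost=3.0),
    Config("lstm-2x128", f1=0.94, cost=5.0),  # dominated by lstm-2x64
    Config("lstm-1x16", f1=0.85, cost=1.5),   # dominated by lstm-1x32
]
front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy example the two surviving configurations trade accuracy against cost, and choosing between them would depend on the resources available at the time.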
