LogGAN: A Sequence-Based Generative Adversarial Network for Anomaly Detection Based on System Logs

System logs, which trace system states and record valuable events, are a significant component of any computer system. Logs contain abundant information (i.e., normal and abnormal instances) that assists administrators in diagnosing problems and maintaining the operation of the system. If diverse and complex anomalies (e.g., bugs and failures) cannot be detected and eliminated efficiently, running workflows and transactions, and even the system itself, may break down. Anomaly detection has therefore become increasingly important and has attracted considerable research attention. However, current approaches concentrate on detecting anomalies at a coarse granularity of logs (i.e., the session level) rather than at the level of individual logs, which reduces the efficiency of responding to anomalies and diagnosing system failures. To overcome this limitation, we propose LogGAN, a sequence-based generative adversarial network for anomaly detection based on system logs, which detects log-level anomalies based on patterns (i.e., combinations of the latest logs). In addition, the generative adversarial network-based model mitigates the effect of the imbalance between normal and abnormal instances, improving its performance in capturing anomalies. To evaluate LogGAN, we conduct extensive experiments on two real-world datasets, and the experimental results show the effectiveness of the proposed approach for log-level anomaly detection.
