Explainable LSTM Model for Anomaly Detection in HDFS Log File using Layerwise Relevance Propagation

Anomaly detection is of critical importance in log file systems, and many supervised techniques have been explored for this problem. Deep learning approaches have shown great promise in log file anomaly detection owing to their ability to learn high-level features and nonlinearities, eliminating the need for domain-specific knowledge or special preprocessing. However, this improved performance comes at the cost of inexplicable outcomes, a consequence of the black-box nature of such models. In this paper, we propose a solution utilizing an LSTM-LRP (Long Short-Term Memory with Layerwise Relevance Propagation) architecture for discrete event sequences, which are obtained by processing log files into log keys derived from individual entries. We extend the idea of LSTM-LRP, previously applied to NLP problems, to log file systems. The model is evaluated on Hadoop Distributed File System (HDFS) logs, where a relevance-based interpretation is provided for every timestep and every feature. Our primary concern in this paper is the interpretation of the results rather than the raw accuracy of the model. This not only offers an interpretation of the outcomes but also helps build trust in the model by ensuring that spurious correlations are avoided, making it suitable for real-world applications.
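As a hedged illustration of the log-key step, the sketch below masks the variable fields of raw HDFS log lines (block IDs, IP addresses, numeric values) so that structurally identical entries collapse to the same discrete key; the regular expressions and placeholder tokens are illustrative assumptions, not the paper's exact parser.

```python
import re

# Illustrative sketch (not the paper's exact parser): collapse each raw HDFS
# log line to a discrete "log key" by masking its variable fields, so that
# a session becomes a sequence of integer key ids suitable for an LSTM.
def to_log_key(line: str) -> str:
    line = re.sub(r"blk_-?\d+", "<blk>", line)                  # block IDs
    line = re.sub(r"\d+\.\d+\.\d+\.\d+(:\d+)?", "<ip>", line)   # IP[:port]
    line = re.sub(r"\d+", "<num>", line)                        # other numbers
    return " ".join(line.split())

def encode_session(lines, vocab):
    """Map a session's log lines to key ids, growing the vocabulary as needed."""
    ids = []
    for raw in lines:
        key = to_log_key(raw)
        ids.append(vocab.setdefault(key, len(vocab)))
    return ids
```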
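The relevance propagation itself can be sketched as follows, using the treatment Arras et al. (2017) describe for sentiment analysis: multiplicative gates receive zero relevance (signal-take-all), element-wise nonlinearities pass relevance through unchanged, and linear layers use the epsilon-stabilized rule. This is a minimal single-layer NumPy sketch under stated assumptions (gates stacked in [i, f, g, o] order, one-hot inputs, bias relevance dropped for brevity); all names and shapes are illustrative, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stab(z, eps):
    # Epsilon stabilizer: push each denominator away from zero, keeping its sign.
    return z + eps * np.where(z >= 0, 1.0, -1.0)

def lstm_forward(X, Wx, Wh, b, V):
    """Single-layer LSTM over a (T, d) one-hot log-key sequence.
    Wx: (4H, d), Wh: (4H, H), b: (4H,), gates stacked in [i, f, g, o] order;
    V: (C, H) maps the final hidden state to class scores."""
    H = Wh.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    cache = []
    for x in X:
        z = Wx @ x + Wh @ h + b
        i, f = sigmoid(z[:H]), sigmoid(z[H:2*H])
        g, o = np.tanh(z[2*H:3*H]), sigmoid(z[3*H:])
        step = {"x": x, "h_prev": h, "c_prev": c,
                "i": i, "f": f, "g": g, "zg": z[2*H:3*H]}
        c = f * c + i * g
        h = o * np.tanh(c)
        step["c"] = c
        cache.append(step)
    return V @ h, h, cache

def lrp_lstm(cache, V, scores, h_T, target, Wx, Wh, eps=1e-3):
    """LRP backward pass: gates get zero relevance (signal-take-all),
    nonlinearities are pass-through, linear layers use the epsilon rule.
    Returns a (T, d) relevance map: one score per timestep and feature."""
    H = h_T.shape[0]
    T, d = len(cache), cache[0]["x"].shape[0]
    Wgx, Wgh = Wx[2*H:3*H], Wh[2*H:3*H]        # weights feeding the cell input g
    Rx = np.zeros((T, d))
    # Output layer: epsilon rule, seeded with the target class score.
    Rh = h_T * V[target] / stab(scores[target], eps)
    Rc = np.zeros(H)
    for t in range(T - 1, -1, -1):
        s = cache[t]
        Rc = Rc + Rh                            # h_t = o * tanh(c_t): o gets none
        denom = stab(s["c"], eps)
        Rc_prev = Rc * (s["f"] * s["c_prev"]) / denom   # share of f * c_{t-1}
        Rg = Rc * (s["i"] * s["g"]) / denom             # share of the signal i * g
        # Epsilon rule through z_g = Wgx x_t + Wgh h_{t-1} + b_g (the bias's
        # relevance share is simply discarded in this sketch).
        w = Rg / stab(s["zg"], eps)
        Rx[t] = s["x"] * (Wgx.T @ w)
        Rh = s["h_prev"] * (Wgh.T @ w)
        Rc = Rc_prev
    return Rx
```

Summing Rx over its feature axis yields the per-timestep relevance referred to above; with one-hot inputs, the nonzero entry of each row attributes that timestep's relevance to a single log key.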
