EXAD: A System for Explainable Anomaly Detection on Big Data Traces

Big Data systems produce huge amounts of data in real time. Finding anomalies in these systems is becoming increasingly important, since it can help reduce the number of failures and shorten the mean time to recovery. In this work, we present EXAD, a new prototype system for explainable anomaly detection, designed in particular for detecting and explaining anomalies in time-series data obtained from traces of Apache Spark jobs. Apache Spark has become the most popular software tool for processing Big Data. The system includes the most well-known approaches to anomaly detection, together with a novel generator of artificial traces that can help the user understand how the different methods compare in performance. In this demo, we will show how the framework works and how users can benefit from fast and efficient anomaly detection when dealing with traces of jobs in Big Data systems.
