Hybrid Approach for Anomaly Detection in Time Series Data

Anomaly detection is an active research field that attracts attention from both industry and academia. It has given rise to many lines of research, which differ in the nature of the data, the availability of labels for normal behavior, and the application domain, such as fraud detection, medical applications, cloud monitoring, or network intrusion detection. However, effective anomaly detection for complex, high-dimensional time series data remains a challenging task. In this work, we propose a hybrid approach composed of an LSTM autoencoder, trained on normal records to learn efficient representations of normal sequences, combined with an SVM classifier for anomaly detection. Experimental results show that encoding time series with a pretrained LSTM encoder yields an efficient representation of the data from which abnormal records can be detected accurately. In fact, the encoded representation significantly reduces the correlations between normal and abnormal records and provides a latent representation that consistently separates the two classes. The proposed hybrid approach outperforms state-of-the-art approaches [1], [2], [3], [4].
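The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes PyTorch and scikit-learn, synthetic sine-wave windows in place of real data, a supervised `SVC` with an RBF kernel as the SVM stage (a one-class SVM could be substituted if no anomaly labels are available), and arbitrary layer sizes and training settings. The names `LSTMAutoencoder`, `make_windows`, and `encode` are hypothetical.

```python
import numpy as np
import torch
from torch import nn
from sklearn.svm import SVC

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Illustrative sizes only; none of these values come from the paper.
SEQ_LEN, N_FEATURES, LATENT_DIM = 20, 1, 8


class LSTMAutoencoder(nn.Module):
    """Hypothetical LSTM autoencoder: sequence -> latent vector -> sequence."""

    def __init__(self, n_features, latent_dim):
        super().__init__()
        self.encoder = nn.LSTM(n_features, latent_dim, batch_first=True)
        self.decoder = nn.LSTM(latent_dim, latent_dim, batch_first=True)
        self.output = nn.Linear(latent_dim, n_features)

    def encode(self, x):
        # The final hidden state of the encoder is used as the latent vector.
        _, (hidden, _) = self.encoder(x)
        return hidden[-1]  # shape: (batch, latent_dim)

    def forward(self, x):
        latent = self.encode(x)
        # Feed the latent vector to the decoder at every time step.
        repeated = latent.unsqueeze(1).repeat(1, x.size(1), 1)
        decoded, _ = self.decoder(repeated)
        return self.output(decoded)


def make_windows(n, anomalous):
    """Synthetic stand-in data: clean sine windows vs. heavily noised ones."""
    t = np.linspace(0.0, 2.0 * np.pi, SEQ_LEN)
    phase = rng.uniform(0.0, 2.0 * np.pi, size=(n, 1))
    noise_scale = 1.0 if anomalous else 0.05
    x = np.sin(t + phase) + rng.normal(0.0, noise_scale, size=(n, SEQ_LEN))
    return x[..., None].astype(np.float32)  # shape: (n, SEQ_LEN, 1)


# Step 1: train the autoencoder on normal windows only.
normal_train = torch.from_numpy(make_windows(64, anomalous=False))
model = LSTMAutoencoder(N_FEATURES, LATENT_DIM)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
for _ in range(30):
    optimizer.zero_grad()
    loss = loss_fn(model(normal_train), normal_train)
    loss.backward()
    optimizer.step()
final_loss = loss.item()  # reconstruction loss at the last training step


def encode(windows):
    """Map raw windows to latent vectors with the trained, frozen encoder."""
    model.eval()
    with torch.no_grad():
        return model.encode(torch.from_numpy(windows)).numpy()


# Step 2: fit an SVM on the encoded representations of labelled windows.
x_train = np.concatenate([make_windows(40, False), make_windows(40, True)])
y_train = np.array([0] * 40 + [1] * 40)  # 0 = normal, 1 = anomalous
x_test = np.concatenate([make_windows(20, False), make_windows(20, True)])
y_test = np.array([0] * 20 + [1] * 20)

latent_train = encode(x_train)
classifier = SVC(kernel="rbf").fit(latent_train, y_train)

# Step 3: classify unseen windows from their latent representations.
predictions = classifier.predict(encode(x_test))
print("test accuracy:", float((predictions == y_test).mean()))
```

The decoder is only needed during training. At detection time, only `encode` and the fitted classifier are used, which reflects the idea of letting the SVM operate on the learned latent space instead of the raw sequences.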

[1] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.

[2] Md. Rafiqul Islam, et al. A survey of anomaly detection techniques in financial domain, 2016, Future Gener. Comput. Syst.

[3] Chao Chen, et al. Using Random Forest to Learn Imbalanced Data, 2004.

[4] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.

[5] Jianhua Liu, et al. Deep Sparse Autoencoder for Feature Extraction and Diagnosis of Locomotive Adhesion Status, 2018, J. Control. Sci. Eng.

[6] Thomas Blaschke, et al. Application of Generative Autoencoder in De Novo Molecular Design, 2017, Molecular Informatics.

[7] Pierre Baldi, et al. Autoencoders, Unsupervised Learning, and Deep Architectures, 2011, ICML Unsupervised and Transfer Learning.

[8] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, ArXiv.

[9] Sandeep Sharma, et al. Anomaly Detection in Medical Wireless Sensor Networks using Machine Learning Algorithms, 2015.

[10] Marco Maggipinto, et al. A Convolutional Autoencoder Approach for Feature Extraction in Virtual Metrology, 2018.

[11] Malik Yousef, et al. One-class document classification via Neural Networks, 2007, Neurocomputing.

[12] Tie-Yan Liu, et al. LightGBM: A Highly Efficient Gradient Boosting Decision Tree, 2017, NIPS.

[13] Hans-Peter Kriegel, et al. LOF: identifying density-based local outliers, 2000, SIGMOD '00.

[14] Zhi-Hua Zhou, et al. Isolation Forest, 2008, 2008 Eighth IEEE International Conference on Data Mining.

[15] Bernhard Schölkopf, et al. Support Vector Method for Novelty Detection, 1999, NIPS.

[16] Samy Bengio, et al. Show and tell: A neural image caption generator, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.

[18] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.

[19] Nitish Srivastava, et al. Unsupervised Learning of Video Representations using LSTMs, 2015, ICML.

[20] Lovekesh Vig, et al. LSTM-based Encoder-Decoder for Multi-sensor Anomaly Detection, 2016, ArXiv.

[21] George J. Knafl, et al. Logistic regression modeling for context-based classification, 1999, Proceedings. Tenth International Workshop on Database and Expert Systems Applications. DEXA 99.

[22] Lovekesh Vig, et al. Long Short Term Memory Networks for Anomaly Detection in Time Series, 2015, ESANN.

[23] Yoshua Bengio, et al. Random Search for Hyper-Parameter Optimization, 2012, J. Mach. Learn. Res.