Securing Deep Learning Models with Autoencoder-Based Anomaly Detection

Deep learning models are on the rise in many scientific fields. Their ability to solve complex and nonlinear tasks has made them very popular. However, in comparison to physical models, they struggle with extrapolation. It is therefore important that the input data at the production stage be similar to the data seen during training. Deviations from the training data often occur in real-world applications due to sensor delays and drifts, aging of the system, and communication errors such as noise. Especially in safety-relevant applications, securing these models against such influences, which can be regarded as anomalies, is essential.

In the past, Autoencoders, especially Variational Autoencoders (VAEs), have proven useful for anomaly detection. Much research focuses on improving the Autoencoder's separation ability for an optimally set anomaly threshold. However, setting the threshold is not trivial and is crucial for good anomaly detection. Choosing an optimal threshold becomes especially challenging when the anomaly is unknown during training, which is often the case in real-world applications.

The proposed method combines a deep learning model with an Autoencoder. The input data is passed to the trained Autoencoder, which reconstructs the input. If the data is similar to the training data, the Autoencoder should be able to reconstruct it accurately; otherwise, an anomaly is suspected. The reconstruction and the original input are both passed through the deep learning model, generating two predictions, which are then compared.

For classification and reinforcement learning tasks with a discrete result space, the prediction for non-anomalous data should yield the same class for both samples. For these tasks, samples for which the two results differ can be sorted out as anomalies, rendering the threshold obsolete.

For regression and continuous reinforcement learning tasks, the difference between the two predictions can be interpreted as a safety measure and is easier to grasp than the Autoencoder's reconstruction error, which is typically used.

The advantage of this method is that it distinguishes between samples that can and cannot be handled by the subsequent application model, instead of merely deciding whether the input is anomalous. This leads to higher robustness of the joint model and better use of the deep learning model's resources. Moreover, the Autoencoder and the deep learning model are trained separately, which makes training considerably more stable than coupled training approaches.

The proposed method is demonstrated on both a classification and a regression task. For classification, the publicly available UEA multivariate time series classification archive was used. For regression, a dataset simulating an SCR catalyst as part of an automotive exhaust gas aftertreatment system was evaluated. For both tasks, common anomalies such as delay and noise were applied to the data.
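To make the comparison step concrete, the following minimal sketch shows how the threshold-free decision for discrete tasks and the prediction-difference safety measure for continuous tasks could be computed. It assumes trained PyTorch modules named autoencoder, classifier, and regressor; these names and interfaces are hypothetical illustrations, not the paper's actual implementation.

```python
import torch

@torch.no_grad()
def flag_anomalies_discrete(autoencoder, classifier, x):
    """Threshold-free check for classification / discrete RL tasks:
    a sample is flagged as anomalous if the class predicted for the
    original input differs from the class predicted for its
    autoencoder reconstruction."""
    x_rec = autoencoder(x)                        # reconstruct the input
    pred_orig = classifier(x).argmax(dim=-1)      # class for original input
    pred_rec = classifier(x_rec).argmax(dim=-1)   # class for reconstruction
    return pred_orig != pred_rec                  # True -> suspected anomaly

@torch.no_grad()
def safety_measure_continuous(autoencoder, regressor, x):
    """For regression / continuous RL tasks: the absolute difference
    between the two predictions serves as a safety measure, expressed
    in the units of the target quantity."""
    x_rec = autoencoder(x)
    return (regressor(x) - regressor(x_rec)).abs()
```

Note that, unlike a fixed reconstruction-error threshold, the discrete check only rejects inputs whose perturbation actually changes the downstream decision, which matches the stated goal of distinguishing inputs the application model can still handle from those it cannot.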
