Robust Anomaly Detection in Time Series through Variational AutoEncoders and a Local Similarity Score

The growing availability of time series data has created demand for new techniques for its automated analysis across several tasks, including anomaly detection. Although the volume of time series data is increasing rapidly, labeled abnormal samples remain scarce, which hinders the performance of most supervised anomaly detection models. In this paper, we present an unsupervised framework composed of a Variational Autoencoder coupled with a local similarity score, which learns solely from available normal data to detect abnormalities in new data. In addition, we propose two techniques that improve the results when at least some abnormal samples are available: a training set cleaning method that removes the influence of corrupted data on detection performance, and the optimization of the detection threshold. Tests were performed on two datasets: ECG5000 and MIT-BIH Arrhythmia. On the ECG5000 dataset, our framework outperforms some supervised and unsupervised approaches found in the literature, achieving an AUC score of 98.79%. On the MIT-BIH dataset, the training set cleaning step removed 60% of the original training samples and improved the anomaly detection AUC score from 91.70% to 93.30%.
