Anomaly Detection at Scale: The Case for Deep Distributional Time Series Models

This paper introduces a new methodology for detecting anomalies in time series data, with a primary application to monitoring the health of (micro-)services and cloud resources. The main novelty of our approach is that, instead of modeling time series of real values or vectors of real values, we model time series of probability distributions over real values (or vectors). This extension allows the technique to be applied to the common scenario in which data is generated by requests coming in to a service and is then aggregated at a fixed temporal frequency. Our method is amenable to streaming anomaly detection and scales to monitoring millions of time series. We demonstrate the superior accuracy of our method on synthetic and public real-world data. On the Yahoo Webscope benchmark, we outperform the state of the art on 3 out of 4 data sets, and on a real-world data set we improve on popular open-source anomaly detection tools by up to 17% on average.
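As a minimal sketch of the aggregation scenario described above, the following Python snippet groups individual request measurements into fixed-width time intervals and represents each interval by an empirical distribution (a normalized histogram) rather than a single number. It illustrates only the data-preparation step, not the model proposed in the paper; the function name `aggregate_to_histograms`, the bin edges, and the example latencies are assumptions made purely for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def aggregate_to_histograms(events, interval, bin_edges):
    """Group (timestamp, value) events into fixed-width time intervals.

    Returns a dict mapping each interval start to a normalized histogram.
    The histogram has len(bin_edges) + 1 bins: one below the first edge,
    one between each pair of consecutive edges, and one at or above the
    last edge. All names here are hypothetical, not taken from the paper.
    """
    n_bins = len(bin_edges) + 1
    counts = defaultdict(lambda: [0] * n_bins)
    for ts, value in events:
        start = (ts // interval) * interval          # interval the event falls in
        counts[start][bisect_right(bin_edges, value)] += 1
    return {
        start: [c / sum(row) for c in row]           # normalize counts to sum to 1
        for start, row in sorted(counts.items())
    }


# Hypothetical request latencies in milliseconds, as (timestamp in seconds, latency).
events = [(0, 50), (10, 120), (59, 130), (60, 250), (61, 90)]
hist = aggregate_to_histograms(events, interval=60, bin_edges=[100, 200])
for start, dist in hist.items():
    print(start, dist)
```

Each interval then contributes one distribution to the series, so a shift in the shape of the latency distribution (for example, a growing tail) stays visible even when a summary statistic such as the mean barely moves.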
