AURORA: A Unified fRamework fOR Anomaly detection on multivariate time series

The ability to accurately and consistently discover anomalies in time series is important in many applications. Fields such as finance (fraud detection), information security (intrusion detection), healthcare, and others all benefit from anomaly detection. Intuitively, anomalies in time series are time points or sequences of time points that deviate from normal behavior characterized by periodic oscillations and long-term trends. For example, the typical activity on e-commerce websites exhibits weekly periodicity and grows steadily before holidays. Similarly, domestic usage of electricity exhibits daily and weekly oscillations combined with long-term season-dependent trends. How can we accurately detect anomalies in such domains while simultaneously learning a model for normal behavior? We propose a robust offline unsupervised framework for anomaly detection in seasonal multivariate time series, called AURORA. A key innovation in our framework is a general background behavior model that unifies periodicity and long-term trends. To this end, we leverage a Ramanujan periodic dictionary and a spline-based dictionary to capture both seasonal and trend patterns. We conduct experiments on both synthetic and real-world datasets and demonstrate the effectiveness of our method. AURORA has significant advantages over existing models for anomaly detection, including high accuracy (AUC of up to 0.98), interpretability of recovered normal behavior (100% accuracy in period detection), and the ability to detect both point and contextual anomalies. In addition, AURORA is orders of magnitude faster than baselines.
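
The abstract names a Ramanujan periodic dictionary without describing its construction. The following is a minimal sketch, assuming the common construction in which the atoms for each candidate period q are the cyclic shifts of the Ramanujan sum c_q(n), periodically extended to the series length. The function names `ramanujan_sum` and `ramanujan_dictionary` and the parameter `max_period` are hypothetical, chosen for illustration; this is not AURORA's implementation and omits the spline dictionary and the anomaly model.

```python
from math import gcd

import numpy as np


def ramanujan_sum(q: int) -> np.ndarray:
    """Return one period (length q) of the Ramanujan sum c_q(n).

    c_q(n) = sum over k in [1, q] with gcd(k, q) == 1 of exp(2*pi*i*k*n/q).
    The result is always integer valued, so the real part is rounded.
    """
    n = np.arange(q)
    coprime = [k for k in range(1, q + 1) if gcd(k, q) == 1]
    c = sum(np.exp(2j * np.pi * k * n / q) for k in coprime)
    return np.round(c.real).astype(int)


def ramanujan_dictionary(max_period: int, length: int) -> np.ndarray:
    """Stack periodic atoms for periods 1..max_period as columns.

    Assumption: each period q contributes phi(q) atoms (phi is Euler's
    totient), namely the first phi(q) cyclic shifts of c_q, each repeated
    periodically to `length` samples.
    """
    t = np.arange(length)
    atoms = []
    for q in range(1, max_period + 1):
        c = ramanujan_sum(q)
        phi = sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)
        for shift in range(phi):
            shifted = np.roll(c, shift)
            atoms.append(shifted[t % q])  # periodic extension to `length`
    return np.stack(atoms, axis=1).astype(float)


if __name__ == "__main__":
    D = ramanujan_dictionary(max_period=4, length=12)
    print(D.shape)
```

In a sketch like this, a seasonal signal would be approximated as a sparse combination of the columns of `D`, and the periods of the selected atoms would indicate the dominant seasonality.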
