Multivariate anomaly detection based on prediction intervals constructed using deep learning

Deep learning models have been shown, under certain circumstances, to outperform traditional statistical methods at forecasting. Furthermore, various techniques have been developed for quantifying forecast uncertainty in the form of prediction intervals. In this paper, we use prediction intervals constructed with the aid of artificial neural networks to detect anomalies in the multivariate setting. Existing deep learning-based anomaly detection approaches face three challenges: (i) large sets of parameters that may be computationally intensive to tune, (ii) too many false positives, which renders the techniques impractical, and (iii) a reliance on labeled training datasets, which are often unavailable in practice. Our approach overcomes these challenges. We benchmark it against the well-established statistical models that are often preferred in practice. We focus on three deep learning architectures, namely cascaded neural networks, reservoir computing, and long short-term memory recurrent neural networks. We find that deep learning outperforms, or at the very least is competitive with, these statistical models.
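
The core idea, flagging an observation as anomalous when it falls outside a prediction interval around a multivariate point forecast, can be illustrated with a minimal sketch. This is not the paper's implementation: the interval construction from empirical residual quantiles, the "any variable outside its interval" rule, and all names (`residual_interval`, `flag_anomalies`, `calib_residuals`, `alpha`) are assumptions chosen for illustration, and the forecasts here are placeholders rather than neural network outputs.

```python
import numpy as np


def residual_interval(forecast, calib_residuals, alpha=0.05):
    """Build per-variable prediction intervals around point forecasts.

    Assumption: intervals come from empirical quantiles of residuals
    (observed minus forecast) collected on held-out data.
    """
    lo_q = np.quantile(calib_residuals, alpha / 2, axis=0)
    hi_q = np.quantile(calib_residuals, 1 - alpha / 2, axis=0)
    # Broadcasting adds the per-variable offsets to every time step.
    return forecast + lo_q, forecast + hi_q


def flag_anomalies(observed, lower, upper):
    """Flag a time step if any variable lies outside its interval."""
    outside = (observed < lower) | (observed > upper)
    return outside.any(axis=1)


# Synthetic stand-ins: 500 held-out residuals for 3 variables.
rng = np.random.default_rng(0)
calib_residuals = rng.normal(0.0, 1.0, size=(500, 3))

# Placeholder point forecasts for 4 time steps (a model would supply these).
forecast = np.zeros((4, 3))
observed = np.array([
    [0.1, -0.2, 0.3],
    [0.0, 9.0, 0.0],   # second variable far above its interval
    [-0.5, 0.4, 0.2],
    [-8.0, 0.1, 0.0],  # first variable far below its interval
])

lower, upper = residual_interval(forecast, calib_residuals)
flags = flag_anomalies(observed, lower, upper)
print(flags)
```

No labels are needed to produce the flags, and the only tuning parameter in this sketch is the nominal miscoverage level `alpha`, which also governs how often normal points are flagged.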
