FuseAD: Unsupervised Anomaly Detection in Streaming Sensors Data by Fusing Statistical and Deep Learning Models

The need for robust unsupervised anomaly detection in streaming data is increasing rapidly in the current era of smart devices, where enormous amounts of data are gathered from numerous sensors. These sensors record the internal state of a machine, the external environment, and the interaction of machines with other machines and humans. It is of prime importance to leverage this information to minimize machine downtime, or even avoid it completely through constant monitoring. Since each device generates a different type of streaming data, a specific kind of anomaly detection technique normally performs better than the others depending on the data type. For some types of data and use cases, statistical anomaly detection techniques work better, whereas for others, deep-learning-based techniques are preferred. In this paper, we present a novel anomaly detection technique, FuseAD, which takes advantage of both statistical and deep-learning-based approaches by fusing them together in a residual fashion. The obtained results show an increase in area under the curve (AUC) compared to state-of-the-art anomaly detection methods when FuseAD is tested on a publicly available dataset (the Yahoo Webscope benchmark). These results suggest that this fusion-based technique can obtain the best of both worlds by combining the strengths of the two approaches and compensating for their weaknesses. We also perform an ablation study to quantify the contribution of the individual components in FuseAD, i.e., the statistical ARIMA model and the deep-learning-based convolutional neural network (CNN) model.
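The idea of fusing two forecasters "in a residual fashion" can be illustrated with a minimal sketch. This is not the FuseAD implementation: a least-squares autoregressive model stands in for ARIMA, a least-squares model on simple nonlinear window features stands in for the CNN, and all function names, the window length `w`, and the training split `n_train` are hypothetical choices made only for illustration. The second model is fit to the residual of the first, the two outputs are summed, and the absolute forecast error serves as the anomaly score.

```python
import numpy as np


def make_windows(series, w):
    """Return (X, y): each row of X holds w past values, y is the value that follows."""
    X = np.stack([series[i:i + w] for i in range(len(series) - w)])
    y = series[w:]
    return X, y


def fit_least_squares(features, target):
    """Fit a linear model with a bias term by ordinary least squares."""
    A = np.hstack([features, np.ones((len(features), 1))])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coef


def apply_model(coef, features):
    """Evaluate a model produced by fit_least_squares."""
    A = np.hstack([features, np.ones((len(features), 1))])
    return A @ coef


def fused_anomaly_scores(series, w=8, n_train=200):
    """Score each point by the error of a residually fused forecast.

    Assumption: the stand-in models below only illustrate the residual wiring;
    the paper's components are an ARIMA model and a CNN.
    """
    X, y = make_windows(series, w)

    # Stage 1: statistical forecaster (AR(w) stand-in for ARIMA).
    stat_coef = fit_least_squares(X[:n_train], y[:n_train])
    stat_pred = apply_model(stat_coef, X)

    # Stage 2: learned correction (stand-in for the CNN), trained on the
    # residual that the statistical forecast leaves behind.
    feats = np.hstack([X, X ** 2])
    residual = y - stat_pred
    corr_coef = fit_least_squares(feats[:n_train], residual[:n_train])
    correction = apply_model(corr_coef, feats)

    # Residual fusion: statistical forecast plus learned correction.
    fused_pred = stat_pred + correction
    return np.abs(y - fused_pred)


# Synthetic demo: a noisy sine wave with one injected point anomaly.
rng = np.random.default_rng(0)
t = np.arange(400)
series = np.sin(2 * np.pi * t / 25) + 0.05 * rng.standard_normal(400)
series[300] += 5.0
scores = fused_anomaly_scores(series)
print("score at injected anomaly:", scores[300 - 8])
print("median score:", np.median(scores))
```

Because the first `w` points have no full history, `scores[i]` refers to `series[i + w]`. Fitting the correction on the residual rather than on the raw target means the second model only has to capture what the statistical forecast misses, which is the intuition behind combining the two families of methods.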
