Multi-head CNN-RNN for multi-time series anomaly detection: An industrial case study

Abstract: Detecting anomalies in time series data is becoming mainstream in a wide variety of industrial applications in which sensors monitor expensive machinery. The complexity of this task increases when multiple heterogeneous sensors provide information of different natures, scales, and frequencies from the same machine. Traditionally, machine learning techniques require a separate data pre-processing step before training, which tends to be very time-consuming and often requires domain knowledge. Recent deep learning approaches have been shown to perform well on raw time series data, eliminating the need for pre-processing. In this work, we propose a deep learning-based approach for supervised multi-time series anomaly detection that combines a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) in different ways. Unlike other approaches, we use independent CNNs, so-called convolutional heads, to deal with anomaly detection in multi-sensor systems. We address each sensor individually, avoiding the need for data pre-processing and allowing for a more tailored architecture for each type of sensor. We refer to this architecture as Multi-head CNN-RNN. The proposed architecture is assessed on a real industrial case study, provided by an industrial partner, in which a service elevator is monitored. Within this case study, three types of anomalies are considered: point, context-specific, and collective. The experimental results show that the proposed architecture is suitable for multi-time series anomaly detection, as it obtained promising results in the real industrial scenario.
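The multi-head idea described in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the layer sizes, the pooling of every head to a common number of time steps, the plain (Elman) recurrent layer, and the random untrained weights are all assumptions made for illustration, and every function name below is hypothetical. The sketch only shows how one independent convolutional head per sensor can feed a shared recurrent layer that outputs an anomaly score for a window.

```python
import math
import random

def conv1d(signal, kernel):
    """Valid 1D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
            for i in range(len(signal) - k + 1)]

def max_pool(seq, steps):
    """Split seq into `steps` equal chunks and keep each chunk's maximum.
    Trailing samples that do not fill a chunk are dropped."""
    size = len(seq) // steps
    return [max(seq[s * size:(s + 1) * size]) for s in range(steps)]

def head(signal, kernel, steps):
    """One convolutional head: conv + ReLU + pooling to a common length."""
    return max_pool(conv1d(signal, kernel), steps)

def rnn(features, w_in, w_rec):
    """Plain recurrent layer over time; returns the final hidden state."""
    h = [0.0] * len(w_rec)
    for x_t in features:
        h = [math.tanh(sum(w_in[u][i] * x_t[i] for i in range(len(x_t)))
                       + sum(w_rec[u][v] * h[v] for v in range(len(h))))
             for u in range(len(h))]
    return h

def multi_head_cnn_rnn(window, kernels, w_in, w_rec, w_out, steps=4):
    """Score one window; `window` holds one raw signal per sensor."""
    # Each sensor gets its own head, so kernel sizes and input lengths
    # (sampling rates) may differ from sensor to sensor.
    per_head = [head(sig, ker, steps) for sig, ker in zip(window, kernels)]
    # Stack head outputs: features[t] holds one value per sensor.
    features = [list(col) for col in zip(*per_head)]
    h = rnn(features, w_in, w_rec)
    logit = sum(w * v for w, v in zip(w_out, h))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> score in (0, 1)

# Illustrative data: two sensors sampled at different rates (assumed values).
random.seed(0)
window = [[math.sin(0.3 * t) for t in range(40)],  # faster sensor
          [0.1 * t for t in range(16)]]            # slower sensor
kernels = [[random.uniform(-1, 1) for _ in range(5)],
           [random.uniform(-1, 1) for _ in range(3)]]
hidden = 3
w_in = [[random.uniform(-1, 1) for _ in window] for _ in range(hidden)]
w_rec = [[random.uniform(-1, 1) for _ in range(hidden)] for _ in range(hidden)]
w_out = [random.uniform(-1, 1) for _ in range(hidden)]

score = multi_head_cnn_rnn(window, kernels, w_in, w_rec, w_out)
print(f"anomaly score: {score:.3f}")
```

Because the weights are random, the printed score carries no meaning; in a supervised setting such as the one the abstract describes, the weights would be learned from labelled windows. A deep learning framework would normally replace these hand-written layers.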
