STAD-FEBTE, a shallow and supervised framework for time series anomaly detection by automatic feature engineering, balancing, and tree-based ensembles: An industrial case study

Modern industrial systems are equipped with multi-sensor units, and building anomaly detection modules to monitor the data they collect has become a vital task. Missing abnormal patterns may lead to faulty products, unwanted shutdowns of the production line, or even catastrophic damage. Sensor measurements of different natures and sampling frequencies form multivariate, heterogeneous time series data. Conventional machine learning models fail to capture the temporal characteristics of such data. Deep learning models can capture them thanks to their internal network architecture, yet training such models requires large datasets with adequate samples from all anomaly classes. This is not the case in real-world problems, where class imbalance is a major issue. Tree-based ensembles are reported to achieve dominant performance on structured tabular data. Inspired by this, we propose a supervised framework that combines tree-based ensembles with an automatic feature engineering pipeline that converts the time series dataset into its tabular counterpart. The suggested method tackles class imbalance by generating synthetic anomalies using balancing techniques. Moreover, it handles heterogeneous multivariate data and allows sensor measurements to be augmented with categorical features. The framework is benchmarked on two relatively small real-world industrial datasets from robotized screwing processes, where it shows better results than commonly used deep learning architectures.
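The three stages named above (time series to tabular features, balancing with synthetic anomalies, tree-based ensemble) can be illustrated with a minimal, standard-library-only sketch. It is not the framework's implementation. The hand-picked summary statistics, the simplified interpolation-based oversampler (random pairs rather than nearest neighbours), the bagged decision stumps, and all function names and toy data are assumptions chosen for illustration only.

```python
import random
import statistics

def extract_features(series):
    """Map one univariate series to a fixed-length tabular row (hypothetical feature set)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return [
        statistics.fmean(series),
        statistics.pstdev(series),
        min(series),
        max(series),
        max(abs(d) for d in diffs),  # largest jump between consecutive samples
    ]

def oversample(rows, labels, rng):
    """Grow each minority class to the majority size by interpolating between
    two randomly chosen rows of that class (a simplified, SMOTE-like stand-in)."""
    counts = {c: labels.count(c) for c in set(labels)}
    majority = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for c, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == c]
        for _ in range(majority - n):
            a, b = rng.choice(pool), rng.choice(pool)
            t = rng.random()
            out_rows.append([x + t * (z - x) for x, z in zip(a, b)])
            out_labels.append(c)
    return out_rows, out_labels

def fit_stump(rows, labels):
    """Depth-1 tree: pick the (feature, threshold, polarity) with best training accuracy."""
    best = None
    for j in range(len(rows[0])):
        for thr in sorted({r[j] for r in rows}):
            for hi in (0, 1):  # hi = label predicted when the feature exceeds thr
                acc = sum((hi if r[j] > thr else 1 - hi) == y
                          for r, y in zip(rows, labels))
                if best is None or acc > best[0]:
                    best = (acc, j, thr, hi)
    return best[1:]

def fit_bagged_stumps(rows, labels, n_estimators, rng):
    """Tiny tree-based ensemble: one stump per bootstrap resample of the training rows."""
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(rows)) for _ in rows]
        stumps.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return stumps

def predict(stumps, row):
    """Majority vote over the stumps; returns 1 for 'anomalous', 0 for 'normal'."""
    votes = sum(hi if row[j] > thr else 1 - hi for j, thr, hi in stumps)
    return int(2 * votes > len(stumps))

# Toy imbalanced dataset: 40 normal noise series, 4 series with an injected spike.
rng = random.Random(0)
normal = [[rng.gauss(0.0, 0.1) for _ in range(50)] for _ in range(40)]
anomalous = []
for _ in range(4):
    s = [rng.gauss(0.0, 0.1) for _ in range(50)]
    s[rng.randrange(50)] += 10.0
    anomalous.append(s)

X = [extract_features(s) for s in normal + anomalous]  # time series -> tabular
y = [0] * 40 + [1] * 4
X_bal, y_bal = oversample(X, y, rng)                    # balance the classes
model = fit_bagged_stumps(X_bal, y_bal, 15, rng)        # fit the ensemble

spike = [0.0] * 50
spike[25] = 10.0
flat = [0.0] * 50
print(predict(model, extract_features(spike)), predict(model, extract_features(flat)))
```

In practice, each stage would more plausibly be delegated to dedicated libraries for feature extraction, resampling, and gradient-boosted or randomized tree ensembles. The sketch only shows how the stages connect: balancing is applied to the tabular rows, after feature extraction and before the ensemble is fitted.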
