Estimating the electrical power output of industrial devices with end-to-end time-series classification in the presence of label noise

In complex industrial settings, it is common practice to monitor the operation of machines in order to detect undesired states, adjust maintenance schedules, optimize system performance, or collect usage statistics of individual machines. In this work, we focus on estimating the power output of a Combined Heat and Power (CHP) machine in a medium-sized company facility by analyzing the total facility power consumption. We formulate the problem as a time-series classification task, where the class label represents the CHP power output. As the facility is fully instrumented and sensor measurements from the CHP are available, we generate the training labels automatically from the CHP sensor readings. However, sensor failures result in mislabeled training samples that are hard to detect and remove from the dataset. We therefore propose a novel multi-task deep learning approach that jointly trains a classifier and an autoencoder with a shared embedding representation. The proposed approach aims to gradually correct mislabeled samples during training in a self-supervised fashion, without any prior assumption on the amount of label noise. We benchmark our approach on several time-series classification datasets and find it to be comparable to, and sometimes better than, state-of-the-art methods. On the real-world use case of predicting the CHP power output, we thoroughly evaluate the architectural design choices and show that the final architecture considerably increases the robustness of the learning process and consistently outperforms other recent state-of-the-art algorithms in the presence of both unstructured and structured label noise.
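To make the shared-embedding idea concrete, the following is a minimal NumPy sketch (forward pass only) of a multi-task model in which one encoder feeds both a classifier head and an autoencoder decoder, and the two objectives are combined into a single joint loss. All layer sizes, the weight initialization, and the loss weighting `alpha` are illustrative assumptions, not the hyperparameters used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class SharedEmbeddingMultiTask:
    """Classifier + autoencoder sharing one embedding representation."""

    def __init__(self, input_dim, embed_dim, n_classes):
        s = 0.1
        self.W_enc = rng.normal(0, s, (input_dim, embed_dim))  # shared encoder
        self.W_cls = rng.normal(0, s, (embed_dim, n_classes))  # classifier head
        self.W_dec = rng.normal(0, s, (embed_dim, input_dim))  # decoder head

    def forward(self, x):
        z = relu(x @ self.W_enc)         # shared embedding
        probs = softmax(z @ self.W_cls)  # class posteriors (CHP power class)
        x_hat = z @ self.W_dec           # reconstruction of the input window
        return probs, x_hat

def joint_loss(probs, x_hat, x, y, alpha=0.5):
    """Weighted sum of classification cross-entropy and reconstruction MSE."""
    ce = -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))
    mse = np.mean((x_hat - x) ** 2)
    return alpha * ce + (1.0 - alpha) * mse

# Toy usage: a batch of 4 consumption windows of 32 samples, 5 power classes.
model = SharedEmbeddingMultiTask(input_dim=32, embed_dim=8, n_classes=5)
x = rng.normal(size=(4, 32))
y = np.array([0, 2, 1, 4])
probs, x_hat = model.forward(x)
loss = joint_loss(probs, x_hat, x, y)
```

The intuition behind the joint objective is that the reconstruction term regularizes the shared embedding with a label-free signal, so samples whose (possibly noisy) labels disagree with the embedding geometry can be identified and gradually relabeled during training.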
