Intelligent Fault Diagnosis With Noisy Labels via Semisupervised Learning on Industrial Time Series

Deep neural networks (DNNs) excel at industrial fault diagnosis, but their performance relies heavily on the quality of human-annotated labels. Because of the perceptual limitations of annotators, industrial time series samples (such as vibration and voltage signals) are frequently mislabeled under several conditions, for example samples that differ in frequency-domain features and samples on class borders. An annotated industrial dataset will therefore inevitably contain some level of label noise, which leads DNNs to overfit and generalize poorly. In this work, we introduce an industrial noisy label semisupervised learning (INL-SSL) fault diagnosis approach that addresses the problem of a portion of the samples in an industrial dataset being mislabeled. The proposed INL-SSL architecture trains two DNNs simultaneously, which cross-train each other to filter out label noise. In particular, for each DNN flow, a fitted Gaussian mixture model divides the time series samples into a labeled set of samples that are likely to be clean and an unlabeled set of samples that are likely to be noisy. Given the labeled and unlabeled data, we propose a time series MixMatch semisupervised learning strategy to train the diagnostic model. An ablation study verifies the benefit of the proposed time series augmentation techniques for semisupervised training. Extensive experiments on a benchmark industrial dataset of rolling element bearings (REB) show that INL-SSL outperforms state-of-the-art approaches. On a second, self-collected REB dataset, the proposed approach also outperforms the comparison methods under noise ratios from 20% to 90%, supporting the model's generalizability.
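The Gaussian mixture split described in the abstract can be illustrated with a minimal sketch. The abstract does not say which quantity the mixture is fitted to, so this example assumes a per-sample training loss (as in DivideMix-style methods), a two-component one-dimensional mixture fitted with EM, and a posterior threshold of 0.5. The function names `fit_two_component_gmm` and `split_by_loss`, the threshold, and the synthetic loss values are all hypothetical and are not taken from the paper.

```python
import math
import random


def _weighted_densities(x, weights, means, variances):
    """Weighted Gaussian density of x under each mixture component."""
    return [
        w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    ]


def fit_two_component_gmm(losses, n_iter=50, eps=1e-6):
    """Fit a 1-D two-component Gaussian mixture to the losses with EM."""
    lo, hi = min(losses), max(losses)
    spread = max(hi - lo, eps)
    weights = [0.5, 0.5]
    means = [lo, hi]  # start one component at each end of the loss range
    variances = [(spread / 4.0) ** 2, (spread / 4.0) ** 2]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each loss.
        resp = []
        for x in losses:
            dens = _weighted_densities(x, weights, means, variances)
            total = sum(dens) + 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp) + eps
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            variances[k] = (
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses)) / nk
                + eps
            )
            weights[k] = nk / len(losses)
    return weights, means, variances


def split_by_loss(losses, threshold=0.5):
    """Return (labeled, unlabeled) index lists.

    Assumption: the component with the smaller mean loss models clean samples.
    A sample goes to the labeled set when its posterior probability of being
    clean exceeds `threshold`; otherwise its label is discarded (unlabeled set).
    """
    weights, means, variances = fit_two_component_gmm(losses)
    clean = 0 if means[0] <= means[1] else 1
    labeled, unlabeled = [], []
    for i, x in enumerate(losses):
        dens = _weighted_densities(x, weights, means, variances)
        p_clean = dens[clean] / (sum(dens) + 1e-300)
        (labeled if p_clean > threshold else unlabeled).append(i)
    return labeled, unlabeled


# Synthetic demo: 80 low-loss ("clean") and 20 high-loss ("noisy") samples.
rng = random.Random(0)
losses = [rng.gauss(0.1, 0.05) for _ in range(80)]
losses += [rng.gauss(2.0, 0.3) for _ in range(20)]
labeled, unlabeled = split_by_loss(losses)
print(len(labeled), len(unlabeled))
```

In a two-network setup such as the one the abstract describes, each network's split would plausibly be used to train the other network, so that one model's selection errors are not reinforced by its own training; that wiring is omitted here.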
