An Ensemble Approach to Anomalous Sound Detection Based on Conformer-Based Autoencoder and Binary Classifier Incorporated with Metric Learning

This paper presents an ensemble approach based on two unsupervised anomalous sound detection (ASD) methods for machine condition monitoring under domain-shifted conditions in the DCASE 2021 Challenge Task 2. The first ASD method is based on a conformer-based sequence-level autoencoder with section ID regression and a self-attention architecture. We use data augmentation techniques such as SpecAugment to boost performance, and we attach a simple scorer module for each section and each domain to address the domain shift problem. The second ASD method is based on a binary classification model trained with metric learning; it uses task-irrelevant outliers as pseudo-anomalous data while controlling the centroids of normal and outlier data in the feature space. As a countermeasure against the domain shift problem, we perform Mixup-based data augmentation with data from the target domain, which yields stable performance for each section. An ensemble approach is applied to each method, and the two resulting ensembles are further combined to maximize ASD performance. The results of the DCASE 2021 Challenge Task 2 show that our proposed method achieves a harmonic mean of 63.745% for the area under the curve (AUC) and partial AUC (p = 0.1) over all machines, sections, and domains.
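The reported figure is a harmonic mean taken over AUC and partial AUC values. The following is a minimal sketch of that aggregation only, assuming the per-condition AUC and pAUC values have already been computed; the function name `harmonic_mean_score`, the dictionary layout, and the numbers in `example` are hypothetical and are not taken from the paper.

```python
from statistics import harmonic_mean


def harmonic_mean_score(results):
    """Aggregate AUC and pAUC values into one harmonic mean.

    `results` maps a hypothetical (machine, section, domain) key to an
    (auc, pauc) pair, each in the range (0, 1].
    """
    values = []
    for auc, pauc in results.values():
        # Both metrics of every condition enter the mean with equal weight.
        values.extend([auc, pauc])
    return harmonic_mean(values)


# Made-up values for illustration only, not results from the paper.
example = {
    ("fan", 0, "source"): (0.80, 0.60),
    ("fan", 0, "target"): (0.50, 0.50),
}
score = harmonic_mean_score(example)
print(f"harmonic mean: {score:.4f}")
```

Because the harmonic mean is pulled toward its smallest inputs, a single weak section or domain lowers the overall score more than it would under an arithmetic mean, which is consistent with the emphasis on stable per-section performance.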
