Paper information - SPIDERnet: Attention Network For One-Shot Anomaly Detection In Sounds

We propose a similarity function for one-shot anomaly detection in sounds (ADS) called SPecific anomaly IDentifiER network (SPIDERnet). In ADS systems, since overlooking an anomaly may result in serious incidents, we need to update such systems using an (often only one) overlooked anomalous sample. A previous study proposed the use of memory-based one-shot learning. A problem with this previous method is that it can detect only short anomalous sounds such as collision sounds because its similarity function is based on a naive mean-squared-error between the input and memorized spectrogram. To detect various anomalous sounds, SPIDERnet consists of (i) a neural network-based feature extractor for measuring similarity in embedded space and (ii) attention mechanisms for absorbing time-frequency stretching. Experimental results on two public datasets indicate that SPIDERnet outperforms conventional methods and robustly detects various anomalous sounds.
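The contrast the abstract draws, between a naive mean-squared-error on spectrograms and a similarity that uses attention to absorb misalignment, can be illustrated with a toy sketch. This is not the SPIDERnet architecture: the learned feature extractor is replaced by the identity (frames are compared directly), the attention score is an assumed negative squared distance, only time shifts are handled (not frequency stretching), and every function name and the `scale` parameter are hypothetical.

```python
import math


def mse(a, b):
    """Mean squared error between two equally shaped lists of frames."""
    total = sum((u - v) ** 2 for fa, fb in zip(a, b) for u, v in zip(fa, fb))
    return total / (len(a) * len(a[0]))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def attend(memory, frames, scale=10.0):
    """For each memorized frame (query), build a soft-aligned input frame.

    The score is the negative squared distance between query and input
    frame, an assumption made for this sketch only.
    """
    aligned = []
    for q in memory:
        scores = [-scale * sum((u - v) ** 2 for u, v in zip(q, k)) for k in frames]
        weights = softmax(scores)
        aligned.append(
            [sum(w * k[d] for w, k in zip(weights, frames)) for d in range(len(q))]
        )
    return aligned


def naive_distance(memory, frames):
    """Frame-by-frame MSE: sensitive to where the event occurs in time."""
    return mse(memory, frames)


def attention_distance(memory, frames):
    """MSE after soft alignment: tolerant to a shift of the same event."""
    return mse(memory, attend(memory, frames))


# Toy "spectrograms": 6 frames with 3 frequency bins each.
silence = [0.0, 0.0, 0.0]
event_a = [1.0, 0.0, 0.0]
event_b = [0.0, 1.0, 0.0]
unrelated = [0.0, 0.0, 1.0]

memory = [event_a, event_b, silence, silence, silence, silence]   # memorized anomaly
shifted = [silence, silence, silence, event_a, event_b, silence]  # same event, later
other = [silence, silence, silence, unrelated, unrelated, silence]  # different sound

if __name__ == "__main__":
    for name, clip in (("shifted", shifted), ("other", other)):
        print(name, naive_distance(memory, clip), attention_distance(memory, clip))
```

In this toy setting the naive distance gives the shifted copy of the memorized event and the unrelated sound the same score, while the attention-based distance is close to zero only for the shifted copy.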
