Real-Time Recognition of Motor Vehicle Whistle with Convolutional Neural Network

This paper proposes a convolutional neural network (CNN) method for recognizing motor vehicle whistles, intended for monitoring illegal whistling. The network takes the spectrum of the audio signal as input and passes it through the trained convolutional network to determine whether a whistle occurred. The network consists of two convolutional layers and two fully connected layers. We achieve a recognition accuracy of 99% on whistle data collected by the China Orient Institute of Noise & Vibration. A single inference takes less than 3 ms, so the method can be used to monitor whistles in real time.
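The architecture described above (spectrum in, two convolutional layers, two fully connected layers, whistle or no-whistle decision out) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the 16x16 input size, the filter counts, the 3x3 kernels, the 2x2 max pooling, the ReLU and softmax activations, and all function names are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random

def conv2d_relu(x, kernels, bias):
    """Valid 2-D cross-correlation followed by ReLU.
    x: [C][H][W], kernels: [O][C][kh][kw], bias: [O]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    kh, kw = len(kernels[0][0]), len(kernels[0][0][0])
    out = []
    for o, k in enumerate(kernels):
        fmap = []
        for i in range(H - kh + 1):
            row = []
            for j in range(W - kw + 1):
                s = bias[o]
                for c in range(C):
                    for di in range(kh):
                        for dj in range(kw):
                            s += k[c][di][dj] * x[c][i + di][j + dj]
                row.append(max(s, 0.0))
            fmap.append(row)
        out.append(fmap)
    return out

def max_pool2(x):
    """2x2 max pooling with stride 2 on every channel (assumed, not from the paper)."""
    return [[[max(ch[i][j], ch[i][j + 1], ch[i + 1][j], ch[i + 1][j + 1])
              for j in range(0, len(ch[0]) - 1, 2)]
             for i in range(0, len(ch) - 1, 2)]
            for ch in x]

def dense(v, weights, bias):
    """Fully connected layer: weights is [out][in]."""
    return [sum(w * a for w, a in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def softmax(z):
    m = max(z)
    e = [math.exp(t - m) for t in z]
    total = sum(e)
    return [t / total for t in e]

def rand(shape, rng, scale=0.1):
    """Nested list of uniform random values with the given shape."""
    if len(shape) == 1:
        return [rng.uniform(-scale, scale) for _ in range(shape[0])]
    return [rand(shape[1:], rng, scale) for _ in range(shape[0])]

def init_params(rng):
    # Hypothetical layer sizes for a 16x16 spectrum patch:
    # conv1 1->4 channels: 16x16 -> 14x14, pooled to 7x7
    # conv2 4->8 channels: 7x7 -> 5x5, pooled to 2x2 (flattened: 8*2*2 = 32)
    return {
        "k1": rand([4, 1, 3, 3], rng), "b1": [0.0] * 4,
        "k2": rand([8, 4, 3, 3], rng), "b2": [0.0] * 8,
        "w3": rand([16, 32], rng), "b3": [0.0] * 16,
        "w4": rand([2, 16], rng), "b4": [0.0] * 2,
    }

def forward(spectrum, p):
    """Return [P(no whistle), P(whistle)] for one 16x16 spectrum patch."""
    x = [spectrum]  # single input channel
    x = max_pool2(conv2d_relu(x, p["k1"], p["b1"]))
    x = max_pool2(conv2d_relu(x, p["k2"], p["b2"]))
    flat = [v for ch in x for row in ch for v in row]
    hidden = [max(a, 0.0) for a in dense(flat, p["w3"], p["b3"])]
    return softmax(dense(hidden, p["w4"], p["b4"]))

rng = random.Random(0)
params = init_params(rng)
spectrum = rand([16, 16], rng, scale=1.0)  # stand-in for a real spectrum patch
probs = forward(spectrum, params)
print(probs)
```

With trained weights, the second probability would be thresholded to decide whether a whistle is present. A network this shallow keeps the per-inference cost low, which is consistent with the real-time goal stated in the abstract, although the timing of this pure-Python sketch says nothing about the reported 3 ms figure.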
