Acoustic Events Processing with Deep Neural Network

Safety is a basic requirement of society and a precondition for a good quality of life. The principal purpose of this work is to recognize potentially dangerous acoustic events, namely gunshots and glass breaking. This paper compares a neural network (NN) based detection system with a hidden Markov model (HMM) based acoustic event detector. The same database was used for both methods; it consisted of gunshots, glass breaks, and background noise. The proposed deep neural network processes the acoustic signal through two hidden layers. The whole process can be divided into three parts: training, testing, and evaluation. Accuracy was chosen as the main performance measure and is computed from a confusion matrix. The resulting accuracy is also compared with previous research in this area.
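
The evaluation step described above can be illustrated with a minimal sketch of how accuracy follows from a confusion matrix for the three classes in the database. The class names, function names, and label sequences below are hypothetical and chosen only for illustration; they are not data or results from the paper.

```python
# Minimal sketch: accuracy from a confusion matrix (illustrative data only).
# Class names and labels are assumptions, not taken from the paper.
CLASSES = ["shot", "glass_break", "background"]


def confusion_matrix(true_labels, predicted_labels, classes=CLASSES):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of correctly classified events: diagonal sum over total count."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical test-set labels and detector outputs.
true_labels = ["shot", "shot", "glass_break",
               "background", "background", "glass_break"]
predicted_labels = ["shot", "background", "glass_break",
                    "background", "background", "shot"]

cm = confusion_matrix(true_labels, predicted_labels)
acc = accuracy(cm)
print(cm)
print(f"accuracy = {acc:.3f}")
```

The off-diagonal entries show which classes are confused with each other, for example a gunshot classified as background noise, which a single accuracy figure alone does not reveal.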
