Multi-microphone acoustic events detection and classification for indoor monitoring

Digital processing of real-world information is fundamental to designing automated solutions for daily tasks. The complex information in real acoustic scenes, which consists of noise and audio events, must be processed before it can be used in further applications. This paper analyses a system that classifies seven indoor acoustic events. The detection and classification algorithm is based on the baseline algorithm of the DCASE 2016 challenge. The acoustic features consist of MFCC coefficients, delta coefficients and acceleration coefficients. The system includes a binary GMM-based classifier for each sound event class. Three microphone configurations have been tested: two with four microphones and one with a single microphone. Simulation results predict better classification when more than one microphone, and therefore more than one classifier, is used. This approach, combined with majority voting techniques, reduces classification errors under every acoustic condition. Robustness against changes in the reflection coefficient also increases. Moreover, the results with multi-microphone configurations confirm the improvement while covering a larger part of the space at the same time.
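
The fusion step mentioned above, in which each microphone has its own classifier and the outputs are combined by majority voting, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the event labels, the per-frame fusion and the tie-handling rule are all assumptions made for illustration, and the per-microphone decisions are hard-coded rather than produced by GMM classifiers.

```python
from collections import Counter


def majority_vote(decisions, tie_label=None):
    """Fuse one predicted label per microphone into a single label.

    Returns tie_label when the two most frequent labels have equal counts
    (an assumed policy; the source does not say how ties are handled).
    """
    if not decisions:
        raise ValueError("need at least one decision")
    counts = Counter(decisions).most_common()
    top_label, top_count = counts[0]
    if len(counts) > 1 and counts[1][1] == top_count:
        return tie_label
    return top_label


def fuse_frames(per_mic_labels):
    """Apply majority voting frame by frame.

    per_mic_labels holds one equal-length label sequence per microphone.
    """
    return [majority_vote(frame) for frame in zip(*per_mic_labels)]


# Hypothetical decisions from four microphones over three frames.
mics = [
    ["speech", "speech", "door"],
    ["speech", "cough", "door"],
    ["cough", "speech", "door"],
    ["speech", "speech", "keys"],
]
fused = fuse_frames(mics)
print(fused)
```

In this toy input each frame has one dissenting microphone, and the vote of the other three overrides it, which is the error-reduction effect the abstract attributes to using more than one classifier.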
