Modification of widely used feature vectors for real-time acoustic events detection

In addition to video surveillance, an acoustic event detection system can be used to monitor large urban areas. Such a system listens for potentially dangerous sounds and raises an alarm when one is detected. We developed our own approach to acoustic event detection, with a modified Viterbi decoder operating over HMMs (Hidden Markov Models) adapted specifically for the long-term monitoring task and our own MFCC (Mel-Frequency Cepstral Coefficients) extraction module. In this paper we evaluate our system on a new testing database that simulates changes in the environment SNR (Signal-to-Noise Ratio), and we also evaluate the influence of CMN (Cepstral Mean Normalization) on detection accuracy. On this occasion we also introduce a new modification to our Viterbi decoder: a feature reduction mechanism that omits a configurable number of MFC coefficients of the input feature vector from the decoding process without retraining the HMMs. The results show that reducing the feature vector to only the delta and acceleration coefficients improves the detection accuracy of our system. We also show that no CMN is required in the front-end, even when an acoustic model trained with CMN is used.
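
One way such a feature reduction can work without retraining is sketched below. It is an illustration only, not the decoder described in this paper. It assumes HMM states with diagonal-covariance Gaussian emissions and a feature layout of [static | delta | acceleration]; neither assumption is stated in the abstract. Under a diagonal covariance, leaving a dimension out of the sum is the same as marginalizing over it, so the trained means and variances can be reused unchanged. The function names `masked_log_likelihood` and `delta_accel_indices` are hypothetical.

```python
import math


def masked_log_likelihood(x, mean, var, keep):
    """Log-density of a diagonal-covariance Gaussian, summed over kept dimensions only.

    x, mean, var: sequences of equal length (one entry per feature dimension).
    keep: indices of the dimensions that take part in scoring.
    """
    total = 0.0
    for i in keep:
        diff = x[i] - mean[i]
        # Per-dimension Gaussian log-density; skipped dimensions add nothing.
        total += -0.5 * (math.log(2.0 * math.pi * var[i]) + diff * diff / var[i])
    return total


def delta_accel_indices(n_static):
    """Indices of delta and acceleration coefficients.

    Assumed layout (hypothetical): [static | delta | acceleration],
    each block holding n_static coefficients.
    """
    return list(range(n_static, 3 * n_static))


# Example: a 39-dimensional vector (13 static + 13 delta + 13 acceleration),
# scored against a unit-variance, zero-mean state using only the last 26 dimensions.
n_static = 13
dim = 3 * n_static
x = [0.1 * i for i in range(dim)]
mean = [0.0] * dim
var = [1.0] * dim
keep = delta_accel_indices(n_static)
score = masked_log_likelihood(x, mean, var, keep)
```

In a Viterbi decoder this score would stand in for the full emission log-likelihood of each state, so the number of omitted coefficients can be changed at run time by changing `keep`.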
