Evaluating the modified Viterbi decoder for long-term audio events monitoring task

Our solution for intelligent audio surveillance systems is based on a modified Viterbi decoder customized for long-term audio-event monitoring. The classical Viterbi decoder evaluates the hypothesis only after the front-end module has detected the end of the event (utterance). The end of the event is usually detected using a UBM (Universal Background Model), relative entropy, or energy features with an adaptable threshold. Our modified decoder does not require assistance from the front-end module: the hypotheses are evaluated by the decoder itself, using our proposed algorithm. This paper describes an evaluation experiment intended to verify that the decoder modification has no significant impact on audio-event recognition accuracy.
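For orientation, the classical pipeline described above can be sketched as follows. This is a minimal illustration and not the modified decoder proposed in the paper: a hypothetical energy-based front-end (`find_event_end`, with assumed smoothing factor `alpha` and threshold `ratio`) marks the end of an event, and only then does a textbook Viterbi pass over the buffered frames produce a hypothesis. All function and parameter names are assumptions made for this example.

```python
def viterbi(log_init, log_trans, log_emit):
    """Textbook Viterbi: most likely state path and its log score.

    log_init[s]     -- log prior of starting in state s
    log_trans[p][s] -- log probability of moving from state p to state s
    log_emit[t][s]  -- log likelihood of frame t under state s
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(log_emit)):
        prev = score
        score, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        backptr.append(ptr)
    # Backtracking starts only here, once the whole segment has been seen.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


def find_event_end(energies, alpha=0.9, ratio=2.0):
    """Hypothetical front-end: energy detector with an adaptable threshold.

    The noise floor is tracked with exponential smoothing over quiet frames;
    the threshold is `ratio` times that floor. Returns the index of the first
    quiet frame after an event, or None if no event has ended.
    """
    noise = energies[0]  # assumes the recording starts with background noise
    in_event = False
    for t, energy in enumerate(energies):
        if energy > ratio * noise:
            in_event = True
        elif in_event:
            return t
        else:
            noise = alpha * noise + (1 - alpha) * energy
    return None
```

In this arrangement the decoder is called on `log_emit[:end]` only after `find_event_end` returns a frame index, which is the dependence on the front-end that the modified decoder is designed to remove.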
