Improving event detection for audio surveillance using Gabor filterbank features

Acoustic event detection in surveillance scenarios is an important but difficult problem: real-world systems struggle with noisy recording conditions. In this work, we propose using Gabor filterbank features to detect target events in different noisy background scenes. These features capture spectro-temporal modulation frequencies in the signal, which makes them well suited to detecting non-stationary sound events. A single-class detector is constructed for each target event. In a hierarchical framework, the separate detectors are then combined into a multi-class detector. Experiments are performed on a database of four target sounds and four background scenarios. On average, the proposed features outperform conventional features at all tested noise levels in terms of both detection and classification performance.
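
To illustrate what "capturing spectro-temporal modulation frequencies" can mean in practice, the following is a minimal sketch of a single 2D Gabor filter applied to a log-spectrogram. It is not the filterbank used in this work: the function names, the Hann envelope, the kernel sizes, the DC-removal step, and the choice to keep the real part of the output are all assumptions made for illustration.

```python
import numpy as np


def gabor_kernel_2d(omega_k, omega_n, size_k=9, size_n=21):
    """Complex 2D Gabor kernel (hypothetical helper, illustrative parameters).

    omega_k: spectral modulation frequency in rad per frequency channel
    omega_n: temporal modulation frequency in rad per time frame
    """
    k = np.arange(size_k) - size_k // 2
    n = np.arange(size_n) - size_n // 2
    # Assumption: separable Hann envelope; other envelopes are possible.
    envelope = np.outer(np.hanning(size_k), np.hanning(size_n))
    # Complex plane wave that selects one spectro-temporal modulation.
    carrier = np.exp(1j * (omega_k * k[:, None] + omega_n * n[None, :]))
    kernel = envelope * carrier
    # Remove the DC component so a constant offset yields zero response.
    kernel = kernel - envelope * (kernel.sum() / envelope.sum())
    return kernel


def filter_spectrogram(log_spec, kernel):
    """Same-size 2D convolution via FFT; returns the real part of the output."""
    num_k, num_n = log_spec.shape
    size_k, size_n = kernel.shape
    shape = (num_k + size_k - 1, num_n + size_n - 1)
    full = np.fft.ifft2(np.fft.fft2(log_spec, s=shape) * np.fft.fft2(kernel, s=shape))
    top, left = size_k // 2, size_n // 2
    return full[top:top + num_k, left:left + num_n].real


# Toy input: 23 channels x 200 frames with a temporal modulation of 1 rad/frame.
frames = np.arange(200)
toy_spec = np.tile(np.cos(1.0 * frames), (23, 1))

matched = filter_spectrogram(toy_spec, gabor_kernel_2d(0.0, 1.0))
mismatched = filter_spectrogram(toy_spec, gabor_kernel_2d(0.0, 2.5))
```

In this toy case the filter tuned to the modulation present in the input responds much more strongly than the detuned one, which is the selectivity a bank of such filters exploits. A full filterbank would repeat this over a grid of (omega_k, omega_n) pairs and stack the outputs into a feature vector.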
