Time-frequency analysis for audio event detection in real scenarios

We propose a sound analysis system for detecting audio events in surveillance applications. The method combines short- and long-time analysis to increase the reliability of the detection. The basic idea is that a sound is composed of small, atomic audio units, some of which are distinctive of a particular class of sounds. Much as words are counted in a text, we count the occurrences of audio units within a given time interval to build the feature vector that describes it. A classifier is then trained to learn which audio units are distinctive of each class of sound. We compare the performance of different sets of short-time features through experiments on the MIVIA audio event data set. We also study the performance and stability of the proposed system in live scenarios, so as to characterize its expected behavior in real applications.
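
The counting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that short-time feature vectors have already been extracted for each frame and that a codebook of audio units is available (for example from a clustering step). The function name `bag_of_audio_words`, the nearest-codeword assignment by Euclidean distance, and the toy values are all hypothetical choices made for illustration.

```python
import numpy as np


def bag_of_audio_words(frames, codebook):
    """Describe a time interval by how often each audio unit occurs in it.

    frames:   (n_frames, n_features) short-time feature vectors (assumed given)
    codebook: (n_words, n_features) audio units (assumed learned beforehand)
    """
    # Squared Euclidean distance from every frame to every codeword.
    d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Each frame is mapped to its nearest audio unit (an assumed assignment rule).
    words = d2.argmin(axis=1)
    # Count occurrences of each audio unit over the interval.
    counts = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    # Normalize so that intervals of different length are comparable.
    return counts / max(len(frames), 1)


# Toy data: three audio units and four short-time frames (illustrative values).
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
frames = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]])
hist = bag_of_audio_words(frames, codebook)
```

The resulting histogram is the long-time feature vector of the interval; in a complete system it would be passed to a classifier that learns which audio units are distinctive of each class.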
