Exploring superframe co-occurrence for acoustic event recognition

In this paper we introduce acoustic superframes, a mid-level representation that overcomes the drawbacks of both global and simple frame-level representations of acoustic events. Through superframe-level recognition, we explore the phenomenon of superframe co-occurrence across different event categories and propose an efficient classification scheme that exploits this feature sharing to improve event-wise recognition. We show empirically that our recognition system achieves a 2.7% classification error rate on the ITC-Irst database. This state-of-the-art performance demonstrates the effectiveness of the proposed approach. Furthermore, we argue that this representation can substantially facilitate the event detection task compared with its counterparts, i.e. global and simple frame-level representations.
