NMF-based temporal feature integration for acoustic event classification

In this paper, we propose a new front-end for Acoustic Event Classification (AEC) tasks that combines the temporal feature integration technique known as Filter Bank Coefficients (FC) with Non-Negative Matrix Factorization (NMF). FC aims to capture the dynamic structure of the short-term features by summarizing the periodogram of each short-term feature dimension in several frequency bands using a predefined filter bank. Because the commonly used filter bank was devised for other tasks (such as music genre classification), it can be suboptimal for AEC. To overcome this drawback, we propose an unsupervised NMF-based method for learning the filters that capture the most relevant temporal information in the short-term features for AEC. Experiments show that the features obtained with this method significantly improve the classification performance of a Support Vector Machine (SVM) based AEC system compared with the baseline FC features.
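
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the NMF cost function, the number of filters, the segment length, or the band edges of the baseline filter bank. The choices below (Euclidean-cost multiplicative updates, 4 filters, 32-frame segments, rectangular bands, random stand-in data) and all function names are assumptions made for illustration only.

```python
import numpy as np

def periodogram(segment):
    """Periodogram along time for each feature dimension.
    segment: (n_frames, n_dims) -> (n_bins, n_dims), n_bins = n_frames // 2 + 1."""
    n_frames = segment.shape[0]
    spectrum = np.fft.rfft(segment, axis=0)
    return np.abs(spectrum) ** 2 / n_frames

def rectangular_filter_bank(n_bins, edges):
    """Predefined bank of rectangular bands (hypothetical band edges).
    Returns an (n_bins, n_filters) matrix of 0/1 weights."""
    bank = np.zeros((n_bins, len(edges) - 1))
    for k, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        bank[lo:hi, k] = 1.0
    return bank

def learn_filters_nmf(V, n_filters, n_iter=200, eps=1e-9, seed=0):
    """Learn filters as the NMF basis W in V ~= W @ H, using multiplicative
    updates for the Euclidean cost (an assumed choice of cost function).
    V: (n_bins, n_examples) non-negative periodograms."""
    rng = np.random.default_rng(seed)
    n_bins, n_examples = V.shape
    W = rng.random((n_bins, n_filters)) + eps
    H = rng.random((n_filters, n_examples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W

def filter_bank_coefficients(segment, filters):
    """Summarize each dimension's periodogram with every filter.
    Returns a vector of length n_filters * n_dims."""
    return (filters.T @ periodogram(segment)).ravel()

# Random stand-in for short-term features (e.g. 12 coefficients per frame).
rng = np.random.default_rng(0)
n_frames, n_dims = 32, 12
segments = [rng.normal(size=(n_frames, n_dims)) for _ in range(20)]

# Unsupervised step: stack the periodograms of all segments and dimensions.
V = np.hstack([periodogram(s) for s in segments])   # (17, 240)
learned = learn_filters_nmf(V, n_filters=4)          # (17, 4)
baseline = rectangular_filter_bank(V.shape[0], [0, 1, 3, 8, 17])

fc_learned = filter_bank_coefficients(segments[0], learned)
fc_baseline = filter_bank_coefficients(segments[0], baseline)
```

Both feature vectors have the same length (4 filters times 12 dimensions), so in this sketch the learned filters would be a drop-in replacement for the predefined bank ahead of a classifier such as an SVM.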
