NMF-Based Spectral Analysis for Acoustic Event Classification Tasks

In this paper, we propose a new front-end for Acoustic Event Classification (AEC) tasks. First, we study the spectral content of different acoustic events by applying Non-Negative Matrix Factorization (NMF) to their spectral magnitudes and compare it with the structure of speech spectra. Second, based on the findings of this study, we propose a new parameterization for AEC that extends the conventional Mel-Frequency Cepstral Coefficients (MFCC) and is based on high-pass filtering of the acoustic event spectra. We also study the influence of different frequency scales on the classification rate of the whole system. The evaluation of the proposed features shows relative error reductions of about 12% at the segment level and about 11% at the target event level with respect to conventional MFCC.
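As a rough illustration of the analysis step described above, the following sketch factorizes a magnitude spectrogram V into spectral basis vectors W and activations H. It assumes multiplicative updates that minimize the Kullback-Leibler divergence; the abstract does not state which NMF variant, number of components, or cutoff frequency was used, so `nmf_kl`, `kl_divergence`, the synthetic data, the 2-component setting, and the 500 Hz cutoff are all hypothetical choices made for the example.

```python
import numpy as np

def nmf_kl(V, n_components, n_iter=200, seed=0, eps=1e-10):
    """Factorize a non-negative matrix V (bins x frames) as W @ H.

    Uses multiplicative updates for the KL divergence (an assumed
    variant; the source does not specify one).
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, n_components)) + eps    # spectral bases
    H = rng.random((n_components, n_frames)) + eps  # activations
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

def kl_divergence(V, WH, eps=1e-10):
    """Generalized KL divergence between V and its approximation WH."""
    return np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH)

# Synthetic stand-in for a magnitude spectrogram: 64 bins x 100 frames,
# built from two hidden spectral patterns (illustrative data only).
data_rng = np.random.default_rng(1)
V = data_rng.random((64, 2)) @ data_rng.random((2, 100))

W, H = nmf_kl(V, n_components=2)
print("KL divergence after fitting:", kl_divergence(V, W @ H))

# Crude high-pass step: discard bins below a cutoff before any further
# feature extraction. The frequency axis and the 500 Hz cutoff are
# arbitrary assumptions, not values taken from the paper.
freqs = np.linspace(0.0, 8000.0, V.shape[0])
V_hp = V[freqs >= 500.0]
print("bins kept after high-pass:", V_hp.shape[0])
```

Inspecting the columns of W for each event class is one way to see which frequency regions carry most of the energy, which is the kind of comparison with speech spectra that the abstract describes.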
