Non-negative matrix factorization with linear constraints for single-channel speech enhancement

This paper investigates a non-negative matrix factorization (NMF)-based approach to the semi-supervised single-channel speech enhancement problem where only non-stationary additive noise signals are given. The proposed method relies on sinusoidal model of speech production which is integrated inside NMF framework using linear constraints on dictionary atoms. This method is further developed to regularize harmonic amplitudes. Simple multiplicative algorithms are presented. The experimental evaluation was made on TIMIT corpus mixed with various types of noise. It has been shown that the proposed method outperforms some of the state-of-the-art noise suppression techniques in terms of signal-to-noise ratio.

[1]  Ephraim Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .

[2]  David Malah,et al.  Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process..

[3]  J. Larsen,et al.  Wind Noise Reduction using Non-Negative Sparse Coding , 2007, 2007 IEEE Workshop on Machine Learning for Signal Processing.

[4]  Yi Hu,et al.  Evaluation of Objective Quality Measures for Speech Enhancement , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[5]  Bhiksha Raj,et al.  Speech denoising using nonnegative matrix factorization with priors , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[6]  Jérôme Idier,et al.  Algorithms for Nonnegative Matrix Factorization with the β-Divergence , 2010, Neural Computation.

[7]  Jonathan G. Fiscus,et al.  Darpa Timit Acoustic-Phonetic Continuous Speech Corpus CD-ROM {TIMIT} | NIST , 1993 .

[8]  Mikkel N. Schmidt,et al.  Single-channel speech separation using sparse non-negative matrix factorization , 2006, INTERSPEECH.

[9]  D. A. van Leeuwen,et al.  Speech and Audio Signal Processing , 2011 .

[10]  Carla Teixeira Lopes,et al.  TIMIT Acoustic-Phonetic Continuous Speech Corpus , 2012 .

[11]  Paris Smaragdis,et al.  A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[12]  S. Boll,et al.  Suppression of acoustic noise in speech using spectral subtraction , 1979 .

[13]  Emmanuel Vincent,et al.  Fast bayesian nmf algorithms enforcing harmonicity and temporal continuity in polyphonic music transcription , 2009, 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.

[14]  Jérôme Idier,et al.  Algorithms for nonnegative matrix factorization with the beta-divergence , 2010, ArXiv.

[15]  Arne Leijon,et al.  A new approach for speech enhancement based on a constrained Nonnegative Matrix Factorization , 2011, 2011 International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS).

[16]  Thomas F. Quatieri,et al.  Speech analysis/Synthesis based on a sinusoidal representation , 1986, IEEE Trans. Acoust. Speech Signal Process..

[17]  Stefan Goetze,et al.  Reduction of Non-stationary Noise for a Robotic Living Assistant using Sparse Non-negative Matrix Factorization , 2012, SMIAE@ACL.