A solution to residual noise in speech denoising with sparse representation

As a promising technique, sparse representation has been extensively investigated in the signal processing community. Recently, it has been widely used for speech processing in noisy environments; however, many problems remain to be solved because of the particular characteristics of speech. One assumption behind speech denoising with sparse representation is that the representation of speech over the dictionary is sparse, while that of the noise is dense. Unfortunately, this assumption does not hold in speech denoising scenarios. We find that many noises, e.g., babble and white noise, are also sparse over a dictionary trained on clean speech, which results in severe residual noise in sparse enhancement. To solve this problem, we propose a novel residual noise reduction (RNR) method, which first identifies the atoms that represent the noise sparsely and then ignores them in the reconstruction of speech. Experimental results show that the proposed method can reduce residual noise substantially.
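The two-step idea (identify the atoms that represent the noise, then leave them out of the reconstruction) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how noise atoms are selected, so the usage-frequency criterion, the greedy matching pursuit coder, the toy identity dictionary, and all names (`matching_pursuit`, `find_noise_atoms`, `reconstruct_without`, `usage_threshold`) are assumptions made for illustration only.

```python
from typing import List, Sequence, Set

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal: Vector, atoms: List[Vector], n_nonzero: int) -> Vector:
    """Greedy sparse coding over unit-norm atoms (stand-in for any sparse coder)."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        corr = [dot(residual, atom) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        if abs(corr[k]) < 1e-12:  # nothing left to explain
            break
        coeffs[k] += corr[k]
        residual = [r - corr[k] * v for r, v in zip(residual, atoms[k])]
    return coeffs


def find_noise_atoms(noise_frames: List[Vector], atoms: List[Vector],
                     n_nonzero: int, usage_threshold: float) -> Set[int]:
    """Assumed criterion: flag atoms used in at least `usage_threshold`
    (a fraction) of the sparse codes of noise-only frames."""
    counts = [0] * len(atoms)
    for frame in noise_frames:
        for i, c in enumerate(matching_pursuit(frame, atoms, n_nonzero)):
            if abs(c) > 1e-12:
                counts[i] += 1
    return {i for i, n in enumerate(counts)
            if n / len(noise_frames) >= usage_threshold}


def reconstruct_without(coeffs: Vector, atoms: List[Vector],
                        ignored: Set[int]) -> Vector:
    """Rebuild the frame from its sparse code, skipping the ignored atoms."""
    out = [0.0] * len(atoms[0])
    for i, c in enumerate(coeffs):
        if i in ignored or c == 0.0:
            continue
        out = [o + c * v for o, v in zip(out, atoms[i])]
    return out


# Toy example: a 4-atom identity dictionary (hypothetical, not a learned one).
atoms = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
noise_frames = [[0.0, 0.0, 0.0, 0.5], [0.0, 0.0, 0.0, -0.3]]
noisy_frame = [1.0, 0.0, 0.0, 0.4]  # "speech" on atom 0, noise on atom 3

noise_atoms = find_noise_atoms(noise_frames, atoms, n_nonzero=2, usage_threshold=0.5)
code = matching_pursuit(noisy_frame, atoms, n_nonzero=2)
denoised = reconstruct_without(code, atoms, noise_atoms)
```

In this toy setting the noise is itself sparse over the dictionary (it always lands on atom 3), which mirrors the failure case described above: a plain sparse reconstruction would keep that component, whereas dropping the flagged atom removes it. A real system would use a dictionary learned from clean speech and a selection rule tuned to the noise type.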
