Speech enhancement with binaural cues derived from a priori codebook

Conventional codebook-driven speech enhancement considers only the spectral envelopes of speech and noise, and the noise type serves as the a priori information when the noisy speech is enhanced. In this paper, we propose a novel codebook-based speech enhancement method that exploits a priori information about binaural cues, namely the clean cue and the pre-enhanced cue, stored in a trained codebook. The method has two main parts: offline training of the cues and online enhancement by means of the cues. Offline, the trained codebook models the a priori information of speech and noise. Online, the pre-enhanced cue is extracted from the noisy observation, the clean cue is estimated by mapping the weighted code vectors, and the enhanced speech is produced from the estimated clean cue. Experimental results show that the proposed approach performs better than the reference methods in both stationary and non-stationary noise conditions.
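
The online step described above, estimating the clean cue from weighted code vectors, might be sketched as follows. This is only an illustration under assumptions that the abstract does not state: the codebook is taken to be a list of paired (pre-enhanced cue, clean cue) vectors, and the weights are taken to be a softmax over negative squared distances. The function name `estimate_clean_cue` and the `temperature` parameter are hypothetical and do not come from the paper.

```python
import math


def estimate_clean_cue(pre_cue, codebook, temperature=1.0):
    """Estimate a clean cue vector as a weighted sum of clean code vectors.

    pre_cue:  pre-enhanced cue extracted from the noisy observation (list of floats).
    codebook: list of (pre_enhanced_vector, clean_vector) pairs (assumed layout).
    temperature: hypothetical knob controlling how sharply the weights peak.
    """
    # Squared Euclidean distance from the observed cue to each
    # pre-enhanced code vector.
    dists = [
        sum((a - b) ** 2 for a, b in zip(pre_cue, pre_vec))
        for pre_vec, _ in codebook
    ]

    # Softmax-style weights (an assumption, not the paper's stated rule);
    # subtracting the minimum distance keeps the exponentials well scaled.
    d_min = min(dists)
    raw = [math.exp(-(d - d_min) / temperature) for d in dists]
    total = sum(raw)
    weights = [r / total for r in raw]

    # Weighted combination of the clean code vectors, one component at a time.
    dim = len(codebook[0][1])
    return [
        sum(w * clean[k] for w, (_, clean) in zip(weights, codebook))
        for k in range(dim)
    ]


# Toy codebook with two entries; the values are made up for illustration.
toy_codebook = [
    ([0.0, 0.0], [1.0, 2.0]),
    ([2.0, 2.0], [3.0, 4.0]),
]
midpoint_estimate = estimate_clean_cue([1.0, 1.0], toy_codebook)
```

In this toy case the observed cue is equidistant from both pre-enhanced code vectors, so the estimate is the average of the two clean code vectors. How the paper actually computes the weights and maps the estimated cue to the enhanced signal is not specified in the abstract.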
