Paper information - Binaural sound segregation for multisource reverberant environments

Binaural sound segregation for multisource reverberant environments

We present a novel method for binaural sound segregation from acoustic mixtures contaminated by both multiple interference and reverberation. We employ the notion of an ideal time-frequency binary mask, which selects the target if it is stronger than the interference in a local time-frequency (T-F) unit. As opposed to classical adaptive filtering, which focuses on the suppression of noise, our model employs an adaptive filter that performs target cancellation. T-F units dominated by a target are largely suppressed at the output of the cancellation unit when compared to units dominated by noise. Consequently, the actual input-to-output attenuation level in each T-F unit is used to estimate an ideal binary mask. A systematic evaluation in terms of automatic speech recognition performance shows that the resulting system produces masks close to ideal binary ones.
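The two masking ideas in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the per-unit energy grids, the 0 dB local criterion, and the 3 dB attenuation threshold are all assumptions made for the example. The first function labels a T-F unit 1 when the target energy exceeds the interference energy; the second labels a unit 1 when the input-to-output attenuation of a target-cancellation stage exceeds a threshold, since target-dominated units are the ones that get suppressed most.

```python
def ideal_binary_mask(target_energy, interference_energy, lc_db=0.0):
    """Ideal binary mask: 1 where target energy exceeds interference
    energy by more than lc_db (hypothetical local criterion, in dB)."""
    factor = 10.0 ** (lc_db / 10.0)
    return [
        [1 if t > factor * i else 0 for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(target_energy, interference_energy)
    ]


def mask_from_attenuation(input_energy, output_energy, threshold_db=3.0):
    """Estimated mask: 1 where the cancellation stage attenuated the unit
    by more than threshold_db (an assumed, illustrative threshold)."""
    factor = 10.0 ** (threshold_db / 10.0)
    return [
        [1 if x > factor * y else 0 for x, y in zip(x_row, y_row)]
        for x_row, y_row in zip(input_energy, output_energy)
    ]


# Toy 2x2 grids of T-F unit energies (rows: frequency channels, columns: frames).
target = [[4.0, 1.0], [0.5, 9.0]]
interference = [[1.0, 2.0], [0.5, 3.0]]
ibm = ideal_binary_mask(target, interference)

# Mixture energy before and after a hypothetical target-cancellation stage.
mixture_in = [[5.0, 3.0], [1.0, 12.0]]
mixture_out = [[1.0, 2.5], [0.9, 2.0]]
estimated = mask_from_attenuation(mixture_in, mixture_out)

print(ibm)
print(estimated)
```

In this toy case the estimated mask agrees with the ideal one, but only because the numbers were chosen that way. In practice the threshold would have to be tuned, possibly per frequency channel.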

DeLiang Wang | Nicoleta Roman
