A multi-channel postfilter based on the diffuse noise sound field

In this paper, we present a multi-channel Directional-to-Diffuse Postfilter (DD-PF) that relies on the assumption of a directional speech signal embedded in diffuse noise. The postfilter takes the output of a superdirective beamformer, such as the Generalized Sidelobe Canceller (GSC), and projects it back to the microphone inputs to separate the sound field into its directional and diffuse components. From these components, the signal-to-noise ratio (SNR) at the beamformer output can be derived without a Voice Activity Detector (VAD). This SNR is then used to construct a noise-cancelling Wiener filter. In our experiments, the developed algorithm outperforms two recent postfilters based on the Transient Beam-to-Reference Ratio (TBRR) and the Multi-Channel Speech Presence Probability (MCSPP).
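
As a rough illustration of the last step only, the sketch below turns per-frequency-bin power estimates of the directional and diffuse components into a Wiener gain, using the textbook relation G = SNR / (1 + SNR). It does not reproduce the estimator of the paper: the function name `wiener_gain`, the assumption that both powers are already available at the beamformer output, and the gain floor are illustrative assumptions.

```python
def wiener_gain(directional_power, diffuse_power, floor=0.1):
    """Per-bin Wiener gain from directional and diffuse power estimates.

    Hypothetical helper for illustration: both inputs are assumed to be
    non-negative power estimates for the same frequency bins, and `floor`
    is an assumed lower limit on the gain, not a value from the paper.
    """
    gains = []
    for p_dir, p_diff in zip(directional_power, diffuse_power):
        if p_dir < 0 or p_diff < 0:
            raise ValueError("power estimates must be non-negative")
        total = p_dir + p_diff
        # With SNR = p_dir / p_diff, SNR / (1 + SNR) equals p_dir / total,
        # which avoids dividing by a zero diffuse power.
        gain = p_dir / total if total > 0 else 0.0
        gains.append(max(gain, floor))
    return gains


if __name__ == "__main__":
    # Three bins: equal powers, noise only, speech-dominated.
    print(wiener_gain([1.0, 0.0, 3.0], [1.0, 1.0, 1.0]))
```

The resulting gains would be multiplied with the beamformer output spectrum bin by bin. Because the SNR comes from the ratio of the two sound field components, no speech activity decision enters the computation.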
