Blind source extraction based on a direction-dependent a-priori SNR

In many hands-free applications, we encounter a speaker located in the near-field embedded in diffuse far-field noise. In this paper, we contribute an algorithm to estimate the speech and noise power spectral densities (PSDs) based on a direction-dependent SNR (DD-SNR). The only prior knowledge needed is a model of the diffuse noise sound field. The enhanced speech signal is obtained by a parametric multi-channel Wiener filter (PMWF), which is constructed without any speech presence or absence probabilities and without smoothing in frequency. We achieve high speech quality and sufficient noise reduction by iteratively improving the speech PSD estimate using the output of the PMWF. The performance of our algorithm is demonstrated using the PESQ and PEASS measures.
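The abstract does not give the filter equations, so the following is only a minimal sketch of the two building blocks it names, under stated assumptions. It assumes a spherically isotropic diffuse noise field, whose coherence between omnidirectional microphones at distance d is sin(2πfd/c)/(2πfd/c), and the commonly used trace form of the PMWF, h = Φ_v⁻¹ Φ_x u / (β + tr(Φ_v⁻¹ Φ_x)). The function names, array geometry, steering vector, and PSD values are hypothetical, and the DD-SNR estimator and the iterative PSD refinement are not reproduced here.

```python
import numpy as np


def diffuse_coherence(mic_positions, freq_hz, c=343.0):
    """Coherence matrix of a spherically isotropic (diffuse) noise field.

    mic_positions: (M, 3) array of microphone coordinates in metres.
    Assumes omnidirectional microphones; c is the speed of sound in m/s.
    """
    diff = mic_positions[:, None, :] - mic_positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so this equals sin(2 pi f d / c) / (2 pi f d / c)
    return np.sinc(2.0 * freq_hz * dist / c)


def pmwf_weights(phi_x, phi_v, beta=1.0, ref=0):
    """PMWF weights for one frequency bin (assumed trace formulation).

    phi_x: (M, M) speech PSD matrix, phi_v: (M, M) noise PSD matrix.
    beta trades noise reduction against speech distortion
    (beta = 0 gives MVDR-like behaviour, beta = 1 the multichannel Wiener filter).
    """
    a = np.linalg.solve(phi_v, phi_x)   # Phi_v^{-1} Phi_x
    lam = np.real(np.trace(a))          # multichannel SNR term
    return a[:, ref] / (beta + lam)     # column `ref` selects the reference mic


# Hypothetical example: 3-microphone line array, 5 cm spacing, 1 kHz bin.
mics = np.array([[0.00, 0.0, 0.0], [0.05, 0.0, 0.0], [0.10, 0.0, 0.0]])
gamma = diffuse_coherence(mics, 1000.0)
phi_v = 0.5 * gamma + 1e-3 * np.eye(3)  # noise PSD times coherence, lightly regularised
d = np.array([1.0, 0.8 * np.exp(-0.3j), 0.6 * np.exp(-0.7j)])  # made-up near-field steering vector
phi_x = 2.0 * np.outer(d, d.conj())     # rank-one speech PSD matrix
h = pmwf_weights(phi_x, phi_v, beta=1.0)
print("speech gain at reference mic:", abs(h.conj() @ d))
```

In this sketch, a single-channel output for a frequency bin would be obtained as `h.conj() @ y` for a microphone vector `y`. Larger `beta` lowers the speech gain below one in exchange for more noise reduction.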
