Spectro-Temporal Filtering for Multichannel Speech Enhancement in Short-Time Fourier Transform Domain

In this letter, we propose a spectro-temporal filtering algorithm for multichannel speech enhancement in the short-time Fourier transform (STFT) domain. Unlike the traditional multiplicative filtering technique, which treats each time-frequency bin independently, the proposed method accounts for interdependencies between components in adjacent frames and frequency bins. The speech and noise power spectral density (PSD) matrices are estimated using an extended formulation that exploits temporal and spectral correlations, and a parametric noise reduction filter built from these PSD matrices is applied to the input microphone array signal. Multichannel speech presence probabilities are estimated within the same unified framework. Experimental results show that the proposed spectro-temporal filtering method improves multichannel speech enhancement performance.
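As a rough illustration of the idea, and not the letter's actual formulation, the following sketch stacks the STFT coefficients of neighboring frames and frequency bins into one extended observation vector and then applies a parametric multichannel Wiener filter of the common form h = (Φ_s + μ Φ_v)^{-1} Φ_s u. The function names `stack_context` and `parametric_wiener`, the context sizes `dk` and `dl`, the border clamping, and the choice of this particular filter form are all assumptions made for the example. PSD estimation and speech presence probability estimation are not shown.

```python
import numpy as np


def stack_context(X, k, l, dk=1, dl=1):
    """Stack neighboring STFT coefficients into one extended vector.

    X has shape (M, K, L): microphones, frequency bins, frames.
    The vector for bin k, frame l holds the M-channel coefficients of
    bins k-dk..k+dk and frames l-dl..l (hypothetical context choice).
    The current bin and frame come first, so index 0 is the reference
    microphone at the point being enhanced. Indices outside the STFT
    grid are clamped to the border (an assumption of this sketch).
    """
    M, K, L = X.shape
    parts = [X[:, k, l]]
    for j in range(dl + 1):
        for i in range(-dk, dk + 1):
            if i == 0 and j == 0:
                continue  # already placed first
            kk = min(max(k + i, 0), K - 1)
            ll = max(l - j, 0)
            parts.append(X[:, kk, ll])
    return np.concatenate(parts)


def parametric_wiener(phi_s, phi_v, mu=1.0, ref=0):
    """Parametric multichannel Wiener filter on the extended vector.

    phi_s, phi_v: speech and noise PSD matrices of the stacked vector.
    mu trades noise reduction against speech distortion (mu = 0 gives
    no filtering, mu = 1 the standard Wiener filter).
    Returns h such that the enhanced coefficient is h^H y.
    """
    n = phi_s.shape[0]
    u = np.zeros(n, dtype=complex)
    u[ref] = 1.0  # selects the reference element of the stacked vector
    return np.linalg.solve(phi_s + mu * phi_v, phi_s @ u)


def enhance_bin(X, k, l, phi_s, phi_v, mu=1.0, dk=1, dl=1):
    """Apply the filter to one time-frequency point of X."""
    y = stack_context(X, k, l, dk, dl)
    h = parametric_wiener(phi_s, phi_v, mu)
    return np.vdot(h, y)  # np.vdot conjugates its first argument: h^H y
```

With `dk = dl = 0` the stacked vector reduces to the usual M-channel observation at a single bin, so the sketch falls back to conventional per-bin multichannel filtering; larger context sizes grow the PSD matrices to M(2·dk+1)(dl+1) rows and columns.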
