Online inter-frame correlation estimation methods for speech enhancement in frequency subbands

In this paper, we propose solutions for the online adaptation of optimal FIR filters for speech enhancement in DFT subbands. An important ingredient of such filters is an estimate of the inter-frame correlation of the clean speech signal. While this correlation was assumed to be perfectly known in former studies, we discuss two online estimation approaches, one based on a constant noise inter-frame correlation and one based on a binary mask. We show that filtering the subband signals with these estimated quantities outperforms conventional instantaneous spectral weighting, such as the frequency-domain Wiener filter, at least in high-SNR conditions.
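
The abstract does not spell out the estimators, so the following is only a minimal sketch of how the first approach (a constant noise inter-frame correlation) might look in a single subband, not the authors' implementation. It assumes that speech and noise are uncorrelated, so the noisy inter-frame correlation is the sum of the speech and noise parts, and that a noise power estimate is available from elsewhere. The function name `estimate_speech_ifc`, the smoothing constant `alpha`, and the floor `eps` are illustrative choices that do not come from the paper.

```python
import cmath


def estimate_speech_ifc(frames, noise_psd, gamma_n, alpha=0.9, eps=1e-12):
    """Track the normalized speech inter-frame correlation in one subband.

    frames    : complex subband coefficients Y(l), one per frame
    noise_psd : noise power estimates, one per frame (assumed to be given)
    gamma_n   : noise inter-frame correlation vector, held constant over
                time (assumption); gamma_n[0] is expected to be 1
    Returns one vector gamma_x(l) of length len(gamma_n) per frame.
    """
    n_taps = len(gamma_n)
    r_y = [0j] * n_taps          # smoothed estimate of E[Y(l-k) Y*(l)]
    history = [0j] * n_taps      # Y(l), Y(l-1), ..., Y(l-n_taps+1)
    result = []
    for y_now, phi_n in zip(frames, noise_psd):
        history = [y_now] + history[:-1]
        # First-order recursive averaging of the noisy correlation.
        r_y = [alpha * r + (1 - alpha) * h * y_now.conjugate()
               for r, h in zip(r_y, history)]
        # Remove the noise contribution, assuming uncorrelated speech and noise.
        r_x = [r - phi_n * g for r, g in zip(r_y, gamma_n)]
        # Floor the speech power estimate to avoid division by zero.
        phi_x = max(r_x[0].real, eps)
        result.append([r / phi_x for r in r_x])
    return result


if __name__ == "__main__":
    # Toy input: a noise-free complex exponential in one subband.
    omega = 0.3
    toy = [cmath.exp(1j * omega * l) for l in range(200)]
    gamma_x = estimate_speech_ifc(toy, [0.0] * 200, [1.0, 0.0, 0.0])
    print(gamma_x[-1])
```

In a sketch like this, the resulting vector would then feed the design of the multi-frame FIR filter in each subband. When the floored speech power is very small, the normalized values become unreliable, so a practical system would need some additional safeguard.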
