Paper information - Decision-directed speech power spectral density matrix estimation for multichannel speech enhancement.

This letter proposes a multichannel decision-directed approach to estimating the speech power spectral density (PSD) matrix for multichannel speech enhancement. Several multichannel speech enhancement filters have been designed to depend only on the speech and noise PSD matrices, so an accurate estimate of the clean speech PSD matrix is crucial for successful noise reduction. In contrast to the conventionally applied maximum likelihood estimator, the proposed decision-directed method tracks time-varying speech characteristics more robustly and improves noise reduction performance under various noise environments.
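The abstract does not give the update equations, so the following is only a rough sketch of what a decision-directed speech PSD matrix estimate could look like for one frequency bin. It assumes a matrix analogue of the classic single-channel decision-directed rule: a weighted sum of the outer product of the previous frame's enhanced speech vector and the noise-subtracted instantaneous estimate from the current frame. The function names, the smoothing factor `alpha`, and the eigenvalue clipping used to keep the result positive semidefinite are all assumptions for illustration, not the authors' formulation.

```python
import numpy as np


def psd_project(mat):
    """Project a square matrix onto the Hermitian positive semidefinite cone."""
    herm = 0.5 * (mat + mat.conj().T)          # enforce Hermitian symmetry
    eigvals, eigvecs = np.linalg.eigh(herm)
    eigvals = np.clip(eigvals, 0.0, None)      # discard negative eigenvalues
    return (eigvecs * eigvals) @ eigvecs.conj().T


def ml_speech_psd(phi_y, phi_v):
    """Subtraction-based estimate: noisy PSD matrix minus noise PSD matrix."""
    return psd_project(phi_y - phi_v)


def dd_speech_psd(x_prev, y, phi_v, alpha=0.98):
    """Assumed decision-directed update for a single frequency bin.

    x_prev : enhanced multichannel speech vector from the previous frame
    y      : noisy multichannel observation in the current frame
    phi_v  : noise PSD matrix estimate
    alpha  : smoothing factor in [0, 1] (hypothetical default)
    """
    prev_term = np.outer(x_prev, x_prev.conj())
    inst_term = psd_project(np.outer(y, y.conj()) - phi_v)
    return alpha * prev_term + (1.0 - alpha) * inst_term
```

In this sketch a larger `alpha` leans more heavily on the previous enhanced frame, which smooths the estimate at the cost of slower tracking, while `alpha = 0` falls back to the instantaneous noise-subtracted term.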

Nam Soo Kim | Yu Gwang Jin | Jong Won Shin
