Multizone near-end speech enhancement under optimal second-order magnitude distortion

This article addresses near-end speech enhancement in a scenario with several playback zones. We explore a signal model that simultaneously accounts for the effects of noise, reverberation, and zone crosstalk. Treating a general smooth distortion measure symbolically, we derive necessary optimality conditions. Applying these conditions to a DFT magnitude-based distortion measure yields an algorithm that performs per-zone spectral subtraction followed by channel inversion. Simulations validate the optimality of the algorithm and show a clear benefit of multizone processing over the iterated application of a single-zone algorithm.
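The two-stage structure named in the abstract (per-zone spectral subtraction, then channel inversion) can be illustrated with a minimal sketch for a single DFT bin and two zones. This is not the article's algorithm: the power-subtraction rule, the function names (`spectral_subtract`, `invert_2x2`, `multizone_bin`), and the assumption of a known, invertible 2x2 crosstalk matrix `h` are all hypothetical choices made for illustration.

```python
import math


def spectral_subtract(target, noise_power, floor=0.0):
    """Assumed rule: power spectral subtraction for one zone and one DFT bin.

    Reduces the target magnitude by the noise power and keeps the target phase.
    The article's actual subtraction rule may differ.
    """
    magnitude = abs(target)
    if magnitude == 0.0:
        return 0j
    reduced_power = max(magnitude ** 2 - noise_power, floor)
    return math.sqrt(reduced_power) * (target / magnitude)


def invert_2x2(h):
    """Closed-form inverse of a 2x2 (possibly complex) matrix."""
    (a, b), (c, d) = h
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("crosstalk matrix is numerically singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def multizone_bin(targets, noise_powers, h):
    """Compute two loudspeaker drive values for one DFT bin.

    Step 1: per-zone spectral subtraction gives the desired signal in each zone.
    Step 2: channel inversion solves h @ drive = desired for the drive values.
    """
    desired = [spectral_subtract(t, n) for t, n in zip(targets, noise_powers)]
    h_inv = invert_2x2(h)
    return [sum(h_inv[row][k] * desired[k] for k in range(2)) for row in range(2)]


# Hypothetical example values: h[z][k] is the gain from loudspeaker k to zone z.
h = [[1.0, 0.2], [0.1, 1.0]]
targets = [1 + 0j, 0.5j]
noise_powers = [0.36, 0.09]
drive = multizone_bin(targets, noise_powers, h)
```

Passing `drive` back through `h` should reproduce the per-zone desired signals, which is the sense in which the inversion step removes crosstalk in this sketch. A practical system would run this per frequency bin and per frame, and would likely need regularization when `h` is poorly conditioned.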
