An invertible frequency eigendomain transformation for masking-based subspace speech enhancement

Auditory masking properties have been widely exploited in speech enhancement techniques, especially those implemented in the spectral domain. Incorporating auditory masking into a subspace technique invariably requires a transformation linking the frequency domain and the eigendomain. In this letter, an invertible transformation between the frequency domain and the eigendomain is derived. The proposed transformation is verified using a conventional masking-based subspace speech enhancement method. Simulation results show that speech enhancement with the proposed transformation outperforms enhancement with the conventional transformation in terms of segmental signal-to-noise ratio (SNR), perceptual evaluation of speech quality (PESQ), and listening tests.
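To illustrate why invertibility matters, the sketch below implements a textbook frequency-to-eigendomain relation for a Toeplitz covariance matrix and a Bartlett-style mapping back. It is not the transformation derived in this letter, and treating it as representative of the conventional mapping is an assumption. The function names, the AR(1)-like test signal, and all parameter values are hypothetical. The forward relation is lambda_i = (1/M) * sum_k Phi(w_k) |V_i(w_k)|^2, where V_i is the M-point DFT of the i-th eigenvector and M >= 2K - 1.

```python
import numpy as np

def psd_from_autocorr(r, n_fft):
    """PSD samples built from one-sided autocorrelation lags r[0..K-1]."""
    K = len(r)
    lags = np.arange(-(K - 1), K)
    r_two_sided = np.concatenate([r[:0:-1], r])  # r(-(K-1)) ... r(K-1)
    omega = 2 * np.pi * np.arange(n_fft) / n_fft
    return np.real(np.exp(-1j * np.outer(omega, lags)) @ r_two_sided)

def freq_to_eig(psd, V):
    """Map PSD samples to eigenvalues via the eigenvectors' DFT magnitudes."""
    n_fft = len(psd)
    W = np.abs(np.fft.fft(V, n=n_fft, axis=0)) ** 2  # shape (n_fft, K)
    return psd @ W / n_fft

def eig_to_freq(eigvals, V, n_fft):
    """Bartlett-style mapping from eigenvalues back to PSD samples."""
    K = V.shape[0]
    W = np.abs(np.fft.fft(V, n=n_fft, axis=0)) ** 2
    return W @ eigvals / K

# Hypothetical test case: autocorrelation r(tau) = rho^|tau|, K x K Toeplitz covariance.
K, rho, n_fft = 8, 0.9, 32  # n_fft >= 2K - 1 makes the forward sum exact
r = rho ** np.arange(K)
idx = np.arange(K)
R = r[np.abs(idx[:, None] - idx[None, :])]
eigvals, V = np.linalg.eigh(R)

psd = psd_from_autocorr(r, n_fft)
lam = freq_to_eig(psd, V)              # reproduces the eigenvalues of R
psd_back = eig_to_freq(eigvals, V, n_fft)  # smoothed, not equal to psd

print("max eigenvalue error:", np.max(np.abs(lam - eigvals)))
print("max PSD round-trip error:", np.max(np.abs(psd_back - psd)))
```

In this sketch the forward mapping recovers the eigenvalues, but the round trip through `eig_to_freq` returns a triangularly windowed (smoothed) spectrum rather than the original one. A mismatch of this kind between the two directions is what an invertible transformation would avoid.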
