Audible Noise Reduction in Eigendomain for Speech Enhancement

A signal subspace scheme based on masking properties is proposed for the enhancement of speech degraded by additive noise. Because masking properties are tied to the critical bands, which derive from the characteristics of the human cochlea, incorporating a masking threshold into a subspace technique requires a transformation between the frequency domain and the eigendomain. We present and apply such an invertible transformation, and use the masking properties of the human auditory system to define the audible noise quantity in the eigendomain. We derive the eigen-decomposition of the estimated speech autocorrelation matrix under the assumption of white noise. An audible noise reduction scheme is then developed based on a signal subspace technique, its implementation is outlined, and the scheme is extended to the colored noise case. Simulation results show that the proposed scheme outperforms other existing subspace methods in terms of segmental signal-to-noise ratio (SNR), perceptual evaluation of speech quality (PESQ), modified Bark spectral distortion (MBSD), spectrograms, and informal listening tests.
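To make the white-noise step concrete, here is a minimal sketch of a generic signal subspace estimator, not the perceptual scheme proposed in the paper. It assumes the noise is white with known variance, so the noisy autocorrelation matrix is R_y = R_x + σ²I. R_y and R_x then share eigenvectors, and the speech eigenvalues are estimated by subtracting σ² from those of R_y. The function name `subspace_enhance`, the parameter `mu`, and the Wiener-like gain rule are illustrative assumptions. The masking-based definition of audible noise is not modeled here.

```python
import numpy as np

def subspace_enhance(y_vectors, noise_var, mu=1.0):
    """Generic white-noise signal subspace estimator (illustrative sketch).

    y_vectors : (N, K) array, each row a noisy K-dimensional speech vector.
    noise_var : assumed known variance of the additive white noise.
    mu        : hypothetical trade-off factor between residual noise and
                signal distortion (an assumption, not from the paper).
    """
    n_vectors = y_vectors.shape[0]

    # Sample autocorrelation matrix of the noisy vectors.
    R_y = y_vectors.T @ y_vectors / n_vectors

    # Symmetric eigen-decomposition; U holds eigenvectors as columns.
    eigvals_y, U = np.linalg.eigh(R_y)

    # White noise: R_y = R_x + noise_var * I, so the eigenvectors are shared
    # and speech eigenvalues are the noisy ones minus noise_var (floored at 0).
    eigvals_x = np.maximum(eigvals_y - noise_var, 0.0)

    # Wiener-like gain per eigen-component; zero in the noise-only subspace.
    gains = eigvals_x / (eigvals_x + mu * noise_var)

    # Linear estimator H = U diag(gains) U^T applied to every row vector.
    H = U @ np.diag(gains) @ U.T
    return y_vectors @ H.T, eigvals_x
```

In this sketch, components whose estimated speech eigenvalue is zero are discarded entirely, which is the usual signal/noise subspace split. A perceptual variant would replace the fixed gain rule with one driven by a masking threshold mapped into the eigendomain.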
