A DCT-based noisy speech enhancement method using Teager energy operator

This paper introduces an improved Discrete Cosine Transform (DCT) based speech enhancement method that uses the Teager energy operator (TEO). The TEO is applied to the DCT coefficients of the noisy speech to derive a time-adaptive threshold with a lower computational burden. Unlike conventional thresholding-based speech enhancement schemes, the proposed method substantially improves the perceptual quality of the enhanced speech by overcoming the problem of over-thresholding of speech segments. In addition, it requires neither a complicated estimate of the noise level nor any knowledge of the signal-to-noise ratio (SNR). Experimental results show that the proposed method is significantly more effective at reducing both white and colored noise in noisy speech across different SNR levels, and that it outperforms some recent state-of-the-art speech enhancement methods in standard objective measures as well as in subjective evaluations.
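
To make the general idea concrete, the following is a minimal sketch of TEO-guided thresholding in the DCT domain. It is not the authors' algorithm: the abstract does not give the threshold rule, so the linear rule in `enhance_frame`, the `base_threshold` parameter, and all function names are assumptions chosen for illustration. The discrete TEO definition, psi[n] = x[n]^2 - x[n-1]x[n+1], and soft thresholding are standard.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a real sequence (direct O(n^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = math.sqrt(1.0 / n) * c[0]
        s += sum(
            math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def teager(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The two endpoints have no full neighbourhood, so they copy the
    nearest interior value (an implementation choice for this sketch).
    """
    n = len(x)
    if n < 3:
        return [v * v for v in x]
    psi = [x[i] ** 2 - x[i - 1] * x[i + 1] for i in range(1, n - 1)]
    return [psi[0]] + psi + [psi[-1]]


def soft_threshold(c, t):
    """Shrink c toward zero by t; values with |c| <= t become zero."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def enhance_frame(frame, base_threshold):
    """Threshold one frame in the DCT domain, guided by Teager energy."""
    coeffs = dct(frame)
    energy = [abs(e) for e in teager(coeffs)]
    peak = max(energy) or 1.0  # avoid dividing by zero on an all-zero frame
    # ASSUMPTION (not from the paper): the threshold falls linearly as the
    # normalised Teager energy rises, so high-energy coefficients, which are
    # more likely to carry speech, are shrunk less.
    thresholds = [base_threshold * (1.0 - e / peak) for e in energy]
    cleaned = [soft_threshold(c, t) for c, t in zip(coeffs, thresholds)]
    return idct(cleaned)
```

In this sketch a frame would be a short list of samples, for example `enhance_frame(samples[0:256], base_threshold=0.05)`, where both the frame length and the threshold value are placeholders. Because a large Teager energy lowers the local threshold, coefficients that look speech-like are protected from over-thresholding, which is the problem the abstract says the method addresses.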
