Speech enhancement using adaptive thresholding based on gamma distribution of Teager energy operated intrinsic mode functions

This paper introduces a speech enhancement algorithm based on adaptive thresholding of the intrinsic mode functions (IMFs) that empirical mode decomposition (EMD) extracts from noisy signal frames. The adaptive threshold values are estimated from gamma statistical models of the Teager energy operated IMFs of the noisy speech and of the estimated noise, based on the symmetric Kullback-Leibler divergence. The enhanced speech signal is obtained by applying a semisoft thresholding function to the IMF coefficients of the noisy speech. The method is tested on the NOIZEUS speech database and compared with wavelet-shrinkage and EMD-shrinkage methods in terms of segmental SNR (SegSNR) improvement, weighted spectral slope (WSS), and perceptual evaluation of speech quality (PESQ). Experimental results show that the proposed method provides a higher SegSNR improvement in dB, a lower WSS distance, and higher PESQ scores than the wavelet-shrinkage and EMD-shrinkage methods. The proposed method performs better than traditional threshold-based speech enhancement approaches across SNR levels from high to low.
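The three building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names, the thresholds `lam1` and `lam2`, and the use of discrete probability vectors in place of fitted gamma densities are assumptions made for illustration. The abstract does not give the exact form of the semisoft function or say how the thresholds are derived from the divergence, so the sketch uses the common two-threshold (firm) shrinkage rule and leaves threshold selection out.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Only interior samples are returned, so the output has len(x) - 2 values.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence KL(p||q) + KL(q||p) of two discrete distributions.

    Assumption: p and q are probability vectors of equal length (for example,
    normalized histograms). eps guards against division by zero and log(0).
    """
    return sum((pi - qi) * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def semisoft_threshold(c, lam1, lam2):
    """Two-threshold semisoft (firm) shrinkage of one coefficient, 0 <= lam1 < lam2.

    |c| <= lam1        -> 0 (treated as noise)
    lam1 < |c| <= lam2 -> shrunk linearly between 0 and lam2
    |c| > lam2         -> kept unchanged
    """
    a = abs(c)
    if a <= lam1:
        return 0.0
    if a <= lam2:
        return math.copysign(lam2 * (a - lam1) / (lam2 - lam1), c)
    return c


if __name__ == "__main__":
    frame = [0.0, 0.4, -1.5, 3.0, -0.2, 0.9]  # toy stand-in for one IMF frame
    print(teager_energy(frame))
    print([semisoft_threshold(c, lam1=0.5, lam2=2.0) for c in frame])
    print(symmetric_kl([0.7, 0.2, 0.1], [0.3, 0.4, 0.3]))
```

The semisoft rule sits between hard and soft thresholding: small coefficients are zeroed, large ones pass through unchanged, and the band between the two thresholds is scaled continuously so the output has no jump at either threshold.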
