On the optimization of sigmoid function for speech enhancement

This paper develops a methodology for optimizing the parameters of a sigmoid gain function based on a weighted sum of two objective measures: the perceptual evaluation of speech quality (PESQ) measure and the log-likelihood ratio (LLR) measure. The sigmoid function has been investigated for speech enhancement as an alternative gain function to the conventional MMSE function and the spectral subtraction function. Its benefit is that both its slope and its mean are tunable parameters. It also has the potential to preserve more of the speech signal at high SNR levels. Both the SNR estimate and the gain function affect the values of objective measures such as PESQ and LLR, and lead to varying subjective quality. Thus, by studying the relationship between the SNR estimate and the gain function, the performance of a single-channel speech enhancement scheme can be optimized. Here, we aim to optimize the parameters of the sigmoid function for different noise types and SNRs. Subjective listening tests demonstrate a significant improvement in the objective measures with a proper choice of parameters.
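
As a rough illustration of the idea described above, the following sketch shows a sigmoid gain with a tunable slope and mean, together with a weighted-sum score and a simple grid search over the two parameters. The exact gain formula, the sign convention of the weighted sum, the grid search itself, and all names (`sigmoid_gain`, `weighted_objective`, `grid_search`, `evaluate`) are assumptions made for illustration; the abstract does not specify them. The `evaluate` callback is hypothetical and stands in for whatever procedure enhances a test signal and returns its PESQ and LLR scores.

```python
import math


def sigmoid_gain(snr_db, slope, mean_db):
    """Assumed logistic gain in (0, 1) that rises with the estimated SNR (dB).

    `mean_db` is the SNR at which the gain equals 0.5; `slope` controls how
    sharply the gain moves from suppression to pass-through.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - mean_db)))


def weighted_objective(pesq, llr, weight):
    """Assumed weighted sum to maximize.

    Higher PESQ is better and lower LLR is better, so LLR enters with a
    negative sign. `weight` in [0, 1] trades one measure against the other.
    """
    return weight * pesq - (1.0 - weight) * llr


def grid_search(evaluate, slopes, means_db, weight):
    """Return (score, slope, mean_db) for the best pair on the grid.

    `evaluate(slope, mean_db)` is a hypothetical callback that returns a
    (pesq, llr) tuple for the enhanced speech under those parameters.
    """
    best = None
    for slope in slopes:
        for mean_db in means_db:
            pesq, llr = evaluate(slope, mean_db)
            score = weighted_objective(pesq, llr, weight)
            if best is None or score > best[0]:
                best = (score, slope, mean_db)
    return best
```

In such a setup the search would be repeated per noise type and input SNR, since the abstract states that the parameters are optimized separately for different conditions.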
