Weighted Sigmoid-Based Frequency-Selective Noise Filtering for Speech Denoising

Noise estimation often has a major impact on the quality of the enhanced signal, especially in speech enhancement applications. Because the statistics of non-stationary noise vary with time, deciding whether a frame is speech-active or speech-inactive is difficult. Further, since there is no prior information about the noise distribution, estimators typically use recursive averaging with a fixed smoothing coefficient ranging from 0.70 to 0.99. This fixed smoothing coefficient ties the current noise estimate to the noise statistics of previous frames. Unfortunately, with a fixed smoothing coefficient the estimator treats speech-active and speech-inactive frames equally, which may cause leakage of speech or noise power and result in a loss of speech intelligibility. To address this problem and to increase noise estimation accuracy, this paper proposes an adaptive smoothing coefficient that depends on the a posteriori SNR and on frequency. Further, this paper investigates the performance of the proposed weighted sigmoid function (WSIG) noise estimator. Both objective and subjective quality assessments indicate that the proposed noise estimator tracks noise spectral variations considerably better than existing state-of-the-art methods.
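
The idea of replacing the fixed smoothing coefficient with one driven by the a posteriori SNR can be illustrated with a minimal sketch. The abstract does not give the WSIG formula, so everything specific here is an assumption made for illustration: the logistic mapping, the per-bin `slopes` used as frequency-dependent weights, the `midpoint` value, and the function names `sigmoid_alpha` and `update_noise_psd`. Only the 0.70 to 0.99 range comes from the text.

```python
import math


def sigmoid_alpha(post_snr, slope=1.0, midpoint=3.0,
                  alpha_min=0.70, alpha_max=0.99):
    """Map a linear a posteriori SNR to a smoothing coefficient.

    Assumed form (not taken from the paper): a logistic curve scaled
    into [alpha_min, alpha_max]. A high SNR suggests speech is present,
    so alpha moves toward alpha_max and the noise estimate is nearly frozen.
    """
    z = slope * (post_snr - midpoint)
    z = max(-60.0, min(60.0, z))  # clamp to keep math.exp in range
    s = 1.0 / (1.0 + math.exp(-z))
    return alpha_min + (alpha_max - alpha_min) * s


def update_noise_psd(noise_psd, noisy_psd, slopes, floor=1e-12):
    """One frame of recursive averaging with a per-bin adaptive alpha.

    noise_psd: previous noise power estimate per frequency bin
    noisy_psd: noisy-speech periodogram of the current frame per bin
    slopes:    hypothetical per-bin sigmoid slopes (frequency dependence)
    """
    updated = []
    for prev, current, slope in zip(noise_psd, noisy_psd, slopes):
        post_snr = current / max(prev, floor)  # a posteriori SNR
        alpha = sigmoid_alpha(post_snr, slope=slope)
        updated.append(alpha * prev + (1.0 - alpha) * current)
    return updated


if __name__ == "__main__":
    previous = [1.0, 1.0]
    frame = [1.0, 100.0]  # bin 1 contains a strong, speech-like burst
    print(update_noise_psd(previous, frame, slopes=[1.0, 1.0]))
```

In this sketch a bin whose power jumps far above the current noise estimate receives a coefficient near 0.99, so little of the burst enters the noise estimate, whereas a fixed coefficient of 0.70 would admit 30% of it.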
