A new perceptual post-filter for single channel speech enhancement

A major drawback of many speech enhancement methods is that they leave an annoying residual noise with a musical character. One potential remedy for this artifact is to incorporate a psychoacoustic model into the design of the suppression filter. This paper proposes a frequency-domain optimal linear estimator with perceptual post-filtering, which exploits the masking properties of the human auditory system to make the residual noise distortion inaudible. The proposed enhancement algorithm is evaluated with the segmental SNR and Perceptual Evaluation of Speech Quality (PESQ) measures under various noise conditions, and it yields better results than the Wiener denoising technique.
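As an illustration of the first evaluation measure named above, the following is a minimal sketch of a segmental SNR computation. The abstract does not give the paper's exact settings, so the function name `segmental_snr`, the 256-sample non-overlapping frames, and the per-frame clamp to [-10, 35] dB are assumptions that follow a common convention rather than the authors' configuration.

```python
import numpy as np

def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR in dB, each frame clamped to [lo_db, hi_db].

    frame_len, lo_db and hi_db are assumed defaults, not values from the paper.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    n_frames = min(len(clean), len(enhanced)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    eps = 1e-12  # guards against division by zero and log of zero
    frame_snrs = []
    for i in range(n_frames):
        start, stop = i * frame_len, (i + 1) * frame_len
        s = clean[start:stop]
        err = s - enhanced[start:stop]  # residual distortion in this frame
        snr_db = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(err ** 2) + eps))
        frame_snrs.append(np.clip(snr_db, lo_db, hi_db))
    return float(np.mean(frame_snrs))

# Usage: a synthetic tone stands in for clean speech.
t = np.arange(4096)
clean = np.sin(2 * np.pi * 0.01 * t)
degraded = 1.1 * clean  # error is 10% of the signal amplitude in every frame
score = segmental_snr(clean, degraded)
```

Clamping each frame keeps silent or perfectly reconstructed frames from dominating the average, which is why the segmental measure is often preferred over a single global SNR for speech.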
