Single-channel speech enhancement in variable noise-level environment

This paper addresses single-channel speech enhancement in environments with a variable noise level. Commonly used single-channel subtractive-type speech enhancement algorithms assume that the background noise level is fixed or slowly varying. In practice, the background noise level may vary quickly, which usually leads to incorrect speech/noise detection and, in turn, to incorrect speech enhancement. To solve this problem, we propose a new subtractive-type speech enhancement scheme. The scheme uses the RTF (refined time-frequency parameter)-based RSONFIN (recurrent self-organizing neural fuzzy inference network) algorithm that we developed previously to detect word boundaries when the background noise level varies. In addition, a new parameter (MiFre) is proposed to estimate the varying background noise level. Based on this parameter, the noise level information used for subtractive-type speech enhancement can be estimated not only during speech pauses but also during speech segments. The scheme has been tested and found to perform well under both variable and fixed background noise levels.
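For readers unfamiliar with the subtractive-type family that the abstract refers to, the following is a minimal sketch of plain magnitude spectral subtraction with a fixed noise estimate. It is not the proposed scheme: the RSONFIN word-boundary detector and the MiFre parameter are not specified in this abstract and are not reproduced here. The function names, frame length, hop size, and spectral floor are illustrative assumptions. Because `noise_mag` is estimated once from a noise-only segment, the sketch has exactly the fixed-noise-level limitation that the abstract describes.

```python
import numpy as np


def _sqrt_hann(frame_len):
    # Square root of a periodic Hann window. Applied at analysis and at
    # synthesis, the product is a Hann window that sums to 1 at 50% overlap.
    n = np.arange(frame_len)
    return np.sqrt(0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len))


def estimate_noise_magnitude(noise_only, frame_len=256):
    """Mean magnitude spectrum over the frames of a noise-only segment."""
    hop = frame_len // 2
    window = _sqrt_hann(frame_len)
    mags = [
        np.abs(np.fft.rfft(noise_only[s:s + frame_len] * window))
        for s in range(0, len(noise_only) - frame_len + 1, hop)
    ]
    return np.mean(mags, axis=0)


def spectral_subtract(noisy, noise_mag, frame_len=256, floor=0.02):
    """Subtract a fixed noise magnitude spectrum from every frame.

    `floor` (an assumed value) keeps a small fraction of the noisy
    magnitude so that no bin becomes negative. The first half frame and
    any tail not covered by a full frame are tapered or left at zero.
    """
    hop = frame_len // 2
    window = _sqrt_hann(frame_len)
    out = np.zeros(len(noisy))
    for s in range(0, len(noisy) - frame_len + 1, hop):
        spec = np.fft.rfft(noisy[s:s + frame_len] * window)
        mag = np.abs(spec)
        # Subtract the noise estimate, then clamp to the spectral floor.
        clean_mag = np.maximum(mag - noise_mag, floor * mag)
        # Reuse the noisy phase, as is usual for subtractive-type methods.
        clean_spec = clean_mag * np.exp(1j * np.angle(spec))
        frame = np.fft.irfft(clean_spec, n=frame_len)
        out[s:s + frame_len] += frame * window  # overlap-add
    return out
```

A typical call would estimate `noise_mag` from a stretch of the recording known to contain no speech and then pass the full recording to `spectral_subtract`. When the noise level changes after that stretch, the estimate goes stale, which is the failure mode that motivates estimating the noise level during speech segments as well.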
