A novel two stage single channel speech enhancement technique

A two stage novel speech enhancement algorithm is presented in this paper using speech and noise spectral estimators with hybrid a priori SNR combined with perceptual weighting filter. The first stage is designed such that it improves the performance of Minimum Mean Square Error Short-Time Spectral Amplitude (MMSE-STSA) estimator by using hybrid a priori SNR and combining statistical estimators of the spectral magnitude of the speech and noise. A perceptual weighting filter is designed and used as a second stage to improve the intelligibility of the processed speech signal. The performance of proposed scheme is tested and results indicate the performance of the proposed algorithm is better than several other MMSE-STSA algorithms available in literature.

[1]  Schuyler Quackenbush,et al.  Objective measures of speech quality , 1995 .

[2]  Yi Hu,et al.  Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions. , 2009, The Journal of the Acoustical Society of America.

[3]  Ephraim Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .

[4]  Olivier Cappé,et al.  Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor , 1994, IEEE Trans. Speech Audio Process..

[5]  Ch. V. Rama Rao,et al.  Noise Estimation for Speech Enhancement in Non-Stationary Environments-A New Method , 2010 .

[6]  James D. Johnston,et al.  Transform coding of audio signals using perceptual noise criteria , 1988, IEEE J. Sel. Areas Commun..

[7]  Richard M. Schwartz,et al.  Enhancement of speech corrupted by acoustic noise , 1979, ICASSP.

[8]  Yi Hu,et al.  Evaluation of Objective Quality Measures for Speech Enhancement , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[9]  Yi Hu,et al.  Incorporating a psychoacoustical model in frequency domain speech enhancement , 2004, IEEE Signal Processing Letters.

[10]  Benoît Champagne,et al.  Incorporating the human hearing properties in the signal subspace approach for speech enhancement , 2003, IEEE Trans. Speech Audio Process..

[11]  Nathalie Virag,et al.  Single channel speech enhancement based on masking properties of the human auditory system , 1999, IEEE Trans. Speech Audio Process..

[12]  Andries P. Hekstra,et al.  Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[13]  John H. L. Hansen,et al.  An effective quality evaluation protocol for speech enhancement algorithms , 1998, ICSLP.

[14]  Yi Hu,et al.  Speech enhancement based on wavelet thresholding the multitaper spectrum , 2004, IEEE Transactions on Speech and Audio Processing.

[15]  Dennis H. Klatt,et al.  Prediction of perceived phonetic distance from critical-band spectra: A first step , 1982, ICASSP.

[16]  Ronald E. Crochiere,et al.  A study of complexity and quality of speech waveform coders , 1978, ICASSP.