Inventory based speech enhancement for speaker dedicated speech communication systems

We present a method for enhancing speech in speaker-dedicated speech communication systems. The proposed procedure differs fundamentally from most state-of-the-art filtering approaches: instead of filtering the distorted signal, it re-synthesizes a new “clean” signal from the likely characteristics of the underlying speech, which are estimated from the distorted signal. We present a successful implementation of the method for a communication system in which both speaker enrollment and noise enrollment are feasible. Forty minutes of clean speech training data are usually sufficient for successful denoising. The proposed method compares very favorably with other state-of-the-art systems in both objective and subjective speech quality assessments.
