A spectrally mixed excitation (SMX) vocoder with robust parameter determination

Sinusoidal speech coders have been widely studied for low-bit-rate coding at around 4 kbit/s. However, errors in estimating the sinusoidal model parameters can seriously degrade speech quality. These estimation errors are generally caused by the variety of speech signal types or by background noise. In this paper, we propose a sinusoidal speech coder with robust parameter determination methods: a spectro-temporal autocorrelation method for robust pitch determination, a frequency-shifting method for robust voicing-level measurement, and a residual-spectrum magnitude coding method for spectral magnitude compensation. Experimental results demonstrate the robustness of the proposed techniques, and an informal listening test of the synthesized speech confirms the effectiveness of the incorporated schemes.
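The abstract does not spell out how the spectro-temporal autocorrelation is computed, so the following is only an illustrative sketch of the general idea suggested by the name: a time-domain autocorrelation and an autocorrelation of the magnitude spectrum are combined so that a lag must score well in both to be selected. The function names, the weighting factor `beta`, the lag range, the FFT size, and the Hamming window are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def normalized_corr(a, b):
    """Mean-removed normalized correlation between two equal-length vectors."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0


def estimate_pitch(frame, fs, lag_min=20, lag_max=147, beta=0.5, n_fft=1024):
    """Pick the lag that maximizes a weighted sum of temporal and spectral
    autocorrelation. All default values are assumptions for illustration."""
    frame = np.asarray(frame, dtype=float)
    window = np.hamming(len(frame))
    # Magnitude spectrum of the windowed frame (bins 0 .. n_fft/2).
    spectrum = np.abs(np.fft.rfft(frame * window, n_fft))

    best_lag, best_score = lag_min, -np.inf
    for lag in range(lag_min, lag_max + 1):
        # Temporal term: similarity of the frame with itself delayed by `lag`.
        r_t = normalized_corr(frame[:-lag], frame[lag:])
        # Spectral term: similarity of the spectrum with itself shifted by
        # the candidate fundamental frequency fs/lag, expressed in FFT bins.
        shift = int(round(n_fft / lag))
        r_s = normalized_corr(spectrum[:-shift], spectrum[shift:])
        score = beta * r_t + (1.0 - beta) * r_s
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag, best_lag
```

The motivation for mixing the two terms in this sketch is that the temporal term alone also peaks at multiples of the true pitch period, while the spectral term alone also peaks at multiples of the true fundamental frequency; only the true period tends to score highly in both. A possible call on a synthetic frame would be `f0, lag = estimate_pitch(frame, 8000)`, where `frame` holds 320 samples of a harmonic signal.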
