Low bit-rate speech coder based on a long-term model

We present a low-bit-rate speech coder that combines a long-term model (LTM) for voiced speech with the waveform interpolation (WI) coder. In the LTM, a periodic input signal undergoes time-varying spectral shaping, which represents the evolution of the pitch-cycle waveform. The shaped signal, which has a fixed pitch period but a time-varying pitch-cycle waveform, is multiplied by a time-varying gain function that represents the variation in speech loudness. The gain-scaled signal is then time-warped to represent the evolution of the pitch period, yielding the output speech signal. The spectral shaping in the proposed coder is based on WI. In WI, speech (or the LPC residual) is viewed as a continuously evolving sequence of pitch-cycle waveforms. A subset of these waveforms is extracted and coded. In the decoder, after inverse quantization, the missing waveforms are synthesized by interpolation. The extracted waveforms are normalized to a fixed length and sequentially aligned using a cyclic shift. A two-dimensional surface, called the prototype waveform surface or characteristic waveform (CW), is then constructed from these waveforms.
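The WI steps described above (length normalization, cyclic alignment, and interpolation of missing waveforms) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the coder's actual method: the function names are hypothetical, length normalization is done by periodic linear interpolation, alignment picks the cyclic shift that maximizes circular correlation with the previous waveform, and missing waveforms are filled in by linear interpolation between two aligned neighbors.

```python
import math


def resample_cycle(cycle, length):
    """Normalize one pitch cycle to `length` samples (periodic linear interpolation)."""
    n = len(cycle)
    out = []
    for k in range(length):
        pos = k * n / length           # fractional position in the original cycle
        i = int(pos)
        frac = pos - i
        a = cycle[i % n]
        b = cycle[(i + 1) % n]         # wrap around: the cycle is treated as periodic
        out.append((1.0 - frac) * a + frac * b)
    return out


def cyclic_shift(wave, shift):
    """Rotate a waveform: out[k] = wave[(k + shift) mod n]."""
    n = len(wave)
    return [wave[(k + shift) % n] for k in range(n)]


def best_shift(reference, wave):
    """Return the cyclic shift of `wave` with maximal circular correlation to `reference`."""
    n = len(wave)

    def corr(s):
        return sum(reference[k] * wave[(k + s) % n] for k in range(n))

    return max(range(n), key=corr)


def align(reference, wave):
    """Cyclically shift `wave` so that it lines up with `reference`."""
    return cyclic_shift(wave, best_shift(reference, wave))


def interpolate(w0, w1, alpha):
    """Linear interpolation between two aligned waveforms, 0 <= alpha <= 1."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(w0, w1)]


def build_surface(cycles, length, steps):
    """Stack normalized, sequentially aligned waveforms into a 2-D surface.

    `steps` interpolated waveforms are inserted between each extracted pair.
    """
    normalized = [resample_cycle(c, length) for c in cycles]
    aligned = [normalized[0]]
    for wave in normalized[1:]:
        aligned.append(align(aligned[-1], wave))   # align to the previous waveform
    surface = []
    for w0, w1 in zip(aligned, aligned[1:]):
        surface.append(w0)
        for j in range(1, steps + 1):
            surface.append(interpolate(w0, w1, j / (steps + 1)))
    surface.append(aligned[-1])
    return surface


# Toy input: two sinusoidal "pitch cycles" with different lengths and phases.
cycle_a = [math.sin(2 * math.pi * k / 20) for k in range(20)]
cycle_b = [math.sin(2 * math.pi * k / 24 + 1.0) for k in range(24)]
surface = build_surface([cycle_a, cycle_b], length=16, steps=3)
```

Each row of `surface` is one fixed-length waveform, so the row index plays the role of the time axis and the column index the position within the pitch cycle. A real coder would also quantize the extracted waveforms before interpolation, which this sketch omits.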
