A 4 kb/s hybrid MELP/CELP coder with alignment phase encoding and zero-phase equalization

The paper describes a hybrid multi-modal codec in which MELP and CELP coders are used for different speech regions. Three modes are used: strongly voiced, weakly voiced, and unvoiced. The weakly voiced mode includes transitions and plosives; it is used when neither a strongly voiced nor an unvoiced region is clearly identified. In the strongly voiced mode the MELP coder is used, while in the weakly voiced and unvoiced modes the CELP coder is employed. To limit switching artifacts between the coders, an alignment phase is estimated and transmitted in the MELP mode, making the original and MELP-synthesized speech time-synchronous. Additionally, zero-phase equalization removes the phase component of the CELP target signal, making the target waveform more similar to the MELP-synthesized speech. These two techniques, alignment-phase encoding and zero-phase equalization, greatly reduce switching artifacts in MELP/CELP transition regions. Formal listening test results show that the 4 kb/s hybrid coder can achieve speech quality equivalent to that of 32 kb/s ADPCM.
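The mode-to-coder assignment described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Mode`, `coder_for_mode`, and `coder_switches` are hypothetical, and the per-frame mode labels are assumed to be given, since the abstract does not say how the classifier decides them. The sketch only shows which coder each mode selects and where a MELP/CELP switch, the point at which artifacts can arise, would occur.

```python
from enum import Enum


class Mode(Enum):
    """The three frame modes named in the abstract (labels are illustrative)."""
    STRONGLY_VOICED = "strongly voiced"
    WEAKLY_VOICED = "weakly voiced"  # includes transitions and plosives
    UNVOICED = "unvoiced"


def coder_for_mode(mode: Mode) -> str:
    """Return the coder the abstract assigns to a frame mode."""
    # MELP handles strongly voiced frames; CELP handles the other two modes.
    return "MELP" if mode is Mode.STRONGLY_VOICED else "CELP"


def coder_switches(modes: list[Mode]) -> list[int]:
    """Return indices of frames whose coder differs from the previous frame's."""
    coders = [coder_for_mode(m) for m in modes]
    return [i for i in range(1, len(coders)) if coders[i] != coders[i - 1]]


# Hypothetical mode sequence for five consecutive frames.
frames = [
    Mode.UNVOICED,
    Mode.WEAKLY_VOICED,
    Mode.STRONGLY_VOICED,
    Mode.STRONGLY_VOICED,
    Mode.WEAKLY_VOICED,
]
print([coder_for_mode(m) for m in frames])
print(coder_switches(frames))
```

In this example the coder changes at frames 2 and 4; these are the boundaries where alignment-phase encoding and zero-phase equalization are meant to keep the two coders' outputs consistent.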
