A mixed prototype waveform/CELP coder for sub 3 kbit/s

CELP (code-excited linear prediction) analysis-by-synthesis speech coders do not distinguish between voiced and unvoiced speech frames. For coding below 3 kbit/s it is necessary to separate unvoiced and voiced frames and to code voiced speech using an inherently periodic scheme. The authors address this by using a prototype waveform coder for voiced frames while retaining a CELP algorithm for unvoiced frames. For voiced speech, a single residual prototype is selected to represent a 25 ms section. Prototypes are interpolated across the frame to provide a smooth evolution of amplitude and harmonic content. Two coding schemes for the prototypes are discussed: a pitch harmonic scheme operating in the DFT (discrete Fourier transform) domain and an impulsive-codebook time-domain technique. Unvoiced frames are coded using a standard CELP architecture without the adaptive codebook search. With either of the voiced-frame coding algorithms, the overall bit rate is shown to be below 3 kbit/s for speech of good communications quality.
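The prototype interpolation mentioned above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a constant pitch period across the frame, two prototypes of equal length, and a simple linear cross-fade in the time domain. The function name `interpolate_prototypes` and its parameters are hypothetical, chosen only for illustration.

```python
def interpolate_prototypes(prev_proto, next_proto, frame_len):
    """Fill one frame with pitch cycles that cross-fade linearly
    from prev_proto to next_proto.

    Assumptions (not from the paper): both prototypes have the same
    length (constant pitch period) and the blend weight is linear in time.
    """
    if len(prev_proto) != len(next_proto):
        raise ValueError("prototypes must have the same length")
    period = len(prev_proto)
    if period == 0 or frame_len <= 0:
        raise ValueError("need a non-empty prototype and a positive frame length")

    frame = []
    for n in range(frame_len):
        k = n % period        # position inside the current pitch cycle
        w = n / frame_len     # blend weight, rising from 0 towards 1
        frame.append((1.0 - w) * prev_proto[k] + w * next_proto[k])
    return frame


# Toy usage: a 4-sample "pitch cycle" whose pulse amplitude doubles over the frame.
prev_cycle = [1.0, 0.0, 0.0, 0.0]
next_cycle = [2.0, 0.0, 0.0, 0.0]
excitation = interpolate_prototypes(prev_cycle, next_cycle, frame_len=8)
```

At an assumed 8 kHz sampling rate, a 25 ms section would correspond to `frame_len=200`. A coder that tracks a changing pitch period would also need to time-align and resample the prototypes before blending, which this sketch leaves out.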
