Variable Frame Rate Transmission: A Review of Methodology and Application to Narrow-Band LPC Speech Coding

We review the variable frame rate (VFR) transmission methodology that we developed, implemented, and tested during the period 1973-1978 for efficiently transmitting LPC vocoder parameters extracted from the input speech at a fixed frame rate. In the VFR method, parameters are transmitted only when their values have changed sufficiently since their preceding transmission. We explored two distinct approaches to automatic implementation of the VFR method. The first approach bases the transmission decisions on a comparison of the parameter values of the present frame with those of the last transmitted frame. The second approach, which is based on a functional perceptual model of speech, compares the parameter values of all the frames that lie in the interval between the last transmitted frame and the present frame against a linear model of parameter variation over that interval. We also consider the application of VFR transmission to the design of narrow-band LPC speech coders with average bit rates of 2000-2400 bits/s. The transmission decisions are made separately for the three sets of LPC parameters (pitch, gain, and spectral parameters), using a separate VFR scheme for each set. A formal subjective speech quality test of six selected LPC coders is described, and the results are presented and analyzed in detail. It is shown that a 2075 bit/s VFR coder produces speech quality equal to or better than that of a 5700 bit/s fixed frame rate coder.
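The two decision rules described above can be illustrated with a minimal Python sketch. It is not the coders' actual implementation: it assumes a single scalar parameter per frame, an absolute-difference error measure, and a fixed threshold. The function names, the threshold value, and the choice to transmit the frame just before the one at which the linear fit breaks down are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def vfr_last_transmitted(frames: Sequence[float], threshold: float) -> List[int]:
    """First approach (sketch): transmit a frame when its value differs from
    the last transmitted frame by more than `threshold` (assumed measure)."""
    sent = [0]  # assumption: the first frame is always transmitted
    for i in range(1, len(frames)):
        if abs(frames[i] - frames[sent[-1]]) > threshold:
            sent.append(i)
    return sent


def vfr_linear_model(frames: Sequence[float], threshold: float) -> List[int]:
    """Second approach (sketch): test every frame between the last transmitted
    frame and the present frame against a straight line joining those two."""
    sent = [0]  # assumption: the first frame is always transmitted
    for i in range(2, len(frames)):
        last = sent[-1]
        span = i - last
        # Largest deviation of any intermediate frame from the linear model.
        worst = max(
            (
                abs(frames[k] - (frames[last] + (frames[i] - frames[last]) * (k - last) / span))
                for k in range(last + 1, i)
            ),
            default=0.0,
        )
        if worst > threshold:
            # Assumption: the line ending at frame i - 1 still fit, so send that frame.
            sent.append(i - 1)
    if sent[-1] != len(frames) - 1:
        sent.append(len(frames) - 1)  # assumption: close the final interval
    return sent


# Hypothetical parameter track: a linear ramp followed by a steady segment.
track = [0.0, 1.0, 2.0, 3.0, 3.0, 3.0, 3.0]
by_difference = vfr_last_transmitted(track, threshold=0.5)
by_linear_fit = vfr_linear_model(track, threshold=0.5)
```

On this track the difference rule transmits every frame of the ramp, while the linear-model rule transmits only the endpoints of each straight segment, because a receiver that interpolates linearly between transmitted frames can recover the ramp from its two ends.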
