Paper information - Recursive coding of spectrum parameters

Recursive coding of spectrum parameters

A theoretical analysis of recursive speech spectrum coding, where predictive and finite state schemes are special cases, is presented. We evaluate the spectral distortion (SD) theoretically and design coders that minimize the SD. The analysis rests on three cornerstones: high-rate theory, PDF modeling, and an approximation of SD. A derivation of the mean L2-norm distortion of a recursive quantizer operating at high rate is provided. Also, the distortion distribution is supplied. The evaluation of the distortion expressions requires a model of the joint PDF of two consecutive spectrum vectors. The LPC spectrum source considered here has outcomes in a bounded region, and this is taken into account in the choice of model and modeling algorithm. It is further shown how to approximate the SD with an L2-norm measure. Combining the results, we show theoretically that 16 bits are needed to achieve an average SD of 1 dB when quantizing ten-dimensional (10-D) spectrum vectors using a first-order recursive scheme. A gain of six bits per frame is noted compared to memoryless quantization. These results rely on high-rate assumptions which are validated in experiments. There, actual high-rate optimal coders are designed and evaluated.
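The abstract uses spectral distortion (SD) in dB as its figure of merit but does not state the formula. As a rough illustration, the sketch below computes the SD between a reference and a quantized LPC filter using the definition common in the speech coding literature: the root mean square difference of the two log power spectra over frequency. This definition, the function names, the frequency grid size, and the example coefficients are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def lpc_log_spectrum_db(a, n_freq=512):
    """Log power spectrum in dB of the all-pole filter 1/A(z).

    `a` holds the coefficients of A(z) = a[0] + a[1] z^-1 + ... + a[p] z^-p.
    The spectrum is sampled on a uniform grid over [0, pi).
    """
    a = np.asarray(a, dtype=float)
    w = np.linspace(0.0, np.pi, n_freq, endpoint=False)
    k = np.arange(len(a))
    # A(e^{jw}) evaluated at every grid frequency
    A = np.exp(-1j * np.outer(w, k)) @ a
    # P(w) = 1/|A|^2, so 10*log10(P) = -20*log10(|A|)
    return -20.0 * np.log10(np.abs(A))

def spectral_distortion_db(a_ref, a_quant, n_freq=512):
    """RMS log-spectral difference in dB (assumed SD definition)."""
    diff = lpc_log_spectrum_db(a_ref, n_freq) - lpc_log_spectrum_db(a_quant, n_freq)
    # The mean over a uniform grid approximates (1/pi) * integral over [0, pi)
    return float(np.sqrt(np.mean(diff ** 2)))

if __name__ == "__main__":
    # Hypothetical first-order filters, chosen only to exercise the function
    a_ref = [1.0, -0.90]
    a_quant = [1.0, -0.88]
    print(f"SD = {spectral_distortion_db(a_ref, a_quant):.3f} dB")
```

In an evaluation like the one the abstract describes, a measure of this kind would be averaged over many frames to obtain the average SD. The paper's L2-norm approximation of SD is a separate step and is not reproduced here.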

Jonas Samuelsson | Per Hedelin
