High-Rate Optimized Recursive Vector Quantization Structures Using Hidden Markov Models

This paper examines the design of recursive vector quantization systems built around Gaussian mixture vector quantizers. The problem of designing such systems for minimum high-rate distortion under an input-weighted squared error measure is discussed. It is shown that, in high dimensions, the design problem becomes equivalent to a weighted maximum likelihood problem. A variety of recursive coding schemes based on hidden Markov models are presented. The proposed systems are applied to the problem of wideband speech line spectral frequency (LSF) quantization under the log spectral distortion (LSD) measure. By combining recursive quantization and random coding techniques, the systems attain transparent quality at rates as low as 36 bits per frame.
