Low-Complexity Source Coding Using Gaussian Mixture Models, Lattice Vector Quantization, and Recursive Coding with Application to Speech Spectrum Quantization

In this paper, we use the Gaussian mixture model (GMM) based multidimensional companding quantization framework to develop two important quantization schemes. In the first scheme, the scalar quantization in the companding framework is replaced by more efficient lattice vector quantization. Low-complexity lattice pruning and quantization schemes are provided for the $E_8$ Gosset lattice. At moderate to high bit rates, the proposed scheme recovers much of the space-filling loss due to the product vector quantizers (PVQ) employed in earlier work, and thereby provides improved performance with a marginal increase in complexity. In the second scheme, we generalize the compression framework to accommodate recursive coding. In this approach, the joint probability density function (PDF) of the parameter vectors of successive source frames is modeled using a GMM. The conditional density of the parameter vector of the current source frame, given the quantized parameter vectors of the previous source frames, is used to generate a new codebook for every current source frame. We demonstrate the efficacy of the proposed schemes in the application of speech spectrum quantization. The proposed scheme is shown to provide superior performance with a moderate increase in complexity when compared with conventional one-step linear prediction based compression schemes for both narrow-band and wide-band speech.
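To make the lattice quantization step concrete, the following is a minimal sketch of nearest-point search in the $E_8$ lattice. It uses the classical coset decomposition $E_8 = D_8 \cup (D_8 + \tfrac{1}{2})$, which is a standard textbook technique and not necessarily the low-complexity scheme developed in the paper. The function names `nearest_d8` and `nearest_e8` are illustrative assumptions, and lattice pruning, scaling, and index assignment are not shown.

```python
import numpy as np

def nearest_d8(x):
    """Nearest point of D8 (integer vectors with even coordinate sum)."""
    x = np.asarray(x, dtype=float)
    f = np.rint(x)                      # round every coordinate to an integer
    if int(f.sum()) % 2 == 0:
        return f                        # parity already even: f is in D8
    # Parity is odd: re-round the coordinate with the largest rounding
    # error in the opposite direction, which flips the parity at least cost.
    err = x - f
    k = int(np.argmax(np.abs(err)))
    g = f.copy()
    g[k] += 1.0 if err[k] >= 0 else -1.0
    return g

def nearest_e8(x):
    """Nearest point of E8, taken as the union of D8 and D8 + 1/2."""
    x = np.asarray(x, dtype=float)
    a = nearest_d8(x)                   # candidate from the integer coset
    b = nearest_d8(x - 0.5) + 0.5       # candidate from the half-integer coset
    da = np.sum((x - a) ** 2)
    db = np.sum((x - b) ** 2)
    return a if da <= db else b

# Example: quantize one 8-dimensional vector (illustrative input only).
rng = np.random.default_rng(0)
x = rng.normal(size=8)
y = nearest_e8(x)
```

The search costs two rounding passes over eight coordinates and one distance comparison, which illustrates why lattice quantizers can stay close to scalar quantization in complexity.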
