Vector quantization based on Gaussian mixture models

We model the underlying probability density function of vectors in a database as a Gaussian mixture (GM) model. The model is used both for high-rate vector quantization analysis and for the design of vector quantizers. We show that the high-rate formulas accurately predict the performance of model-based quantizers. We propose a novel method for optimizing GM model parameters for high-rate performance, and we present an extension of the EM algorithm for densities with bounded support. The methods are applied to the quantization of LPC parameters in speech coding, and we present new high-rate analysis results for band-limited spectral distortion and outlier statistics. In practical terms, we find that an optimal single-stage VQ can operate at approximately 3 bits fewer than a state-of-the-art LSF-based 2-split VQ.
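
For orientation, the standard form of a Gaussian mixture density is sketched below. The symbols (M components, weights \rho_i, means \mu_i, covariance matrices C_i, vector dimension k) are notation assumed here for illustration and are not necessarily the notation used in the paper itself.

```latex
% Gaussian mixture density over k-dimensional vectors x (assumed notation)
f(x) = \sum_{i=1}^{M} \rho_i \, \mathcal{N}(x;\, \mu_i, C_i),
\qquad \rho_i \ge 0, \quad \sum_{i=1}^{M} \rho_i = 1,

% with each component a multivariate Gaussian density
\mathcal{N}(x;\, \mu_i, C_i)
  = \frac{1}{(2\pi)^{k/2} \, |C_i|^{1/2}}
    \exp\!\left( -\tfrac{1}{2} (x - \mu_i)^{\mathsf{T}} C_i^{-1} (x - \mu_i) \right).
```

In this form the model parameters are the weights, means, and covariances. The EM algorithm is the usual way to fit such parameters by maximum likelihood.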