Efficient speech recognition using subvector quantization and discrete-mixture HMMs

This paper introduces a new form of observation distribution for hidden Markov models (HMMs) that combines subvector quantization with mixtures of discrete distributions. We present efficient training and decoding algorithms for the resulting discrete-mixture HMMs (DMHMMs). Our experimental results in the air-travel information domain show that the high level of recognition accuracy of continuous mixture-density HMMs (CDHMMs) can be maintained at significantly faster decoding speeds. Moreover, we show that when DMHMMs and CDHMMs use the same number of mixture components, the new models exhibit superior recognition performance.
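
To make the model form concrete, the following is a minimal sketch of how a discrete-mixture observation probability could be evaluated. It assumes a form the abstract does not spell out: the feature vector is split into subvectors, each subvector is quantized with its own codebook, and the state likelihood is a weighted sum over mixture components of the product of per-subvector discrete probabilities. The names `quantize`, `dm_observation_prob`, `splits`, `codebooks`, `weights`, and `tables` are illustrative and do not come from the paper.

```python
def quantize(subvector, codebook):
    """Return the index of the nearest codeword (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(subvector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def dm_observation_prob(x, splits, codebooks, weights, tables):
    """Assumed discrete-mixture likelihood: sum_k w_k * prod_s tables[k][s][code_s].

    x         -- feature vector (list of floats)
    splits    -- (start, end) slice bounds, one pair per subvector
    codebooks -- one codebook (list of codewords) per subvector
    weights   -- mixture weights, one per component
    tables    -- tables[k][s] is the discrete distribution of component k
                 over the codewords of subvector s
    """
    # Quantize each subvector once; the codes are shared by all components.
    codes = [quantize(x[a:b], cb) for (a, b), cb in zip(splits, codebooks)]
    total = 0.0
    for weight, component in zip(weights, tables):
        prob = weight
        for s, code in enumerate(codes):
            prob *= component[s][code]  # table lookup, no density evaluation
        total += prob
    return total


# Toy example: a 3-dimensional vector split into a 2-d and a 1-d subvector.
x = [0.1, 0.9, 2.0]
splits = [(0, 2), (2, 3)]
codebooks = [[[0.0, 1.0], [1.0, 0.0]], [[0.0], [2.0]]]
weights = [0.5, 0.5]
tables = [
    [[0.8, 0.2], [0.5, 0.5]],
    [[0.4, 0.6], [0.25, 0.75]],
]
p = dm_observation_prob(x, splits, codebooks, weights, tables)
```

Under these assumptions, each subvector is quantized once per frame and every mixture component then costs only table lookups and multiplications, which is one plausible source of the decoding speedup reported above.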
