Paper information - Gaussian mixture selection using context-independent HMM

Gaussian mixture selection using context-independent HMM

We present a method for efficiently selecting Gaussian mixtures for fast acoustic likelihood computation. It uses context-independent models both for selection and as the back-off for the corresponding triphone models. Specifically, the higher-resolution triphone models are applied only to the k-best phone models found in a preliminary evaluation, and all other states are assigned the likelihoods of the monophone models. This selection scheme assigns more reliable back-off likelihoods to the unselected states than conventional Gaussian selection based on a VQ codebook. It can also incorporate efficient Gaussian pruning at the preliminary evaluation, which offsets the increased size of the pre-selection model. Experimental results show that the proposed method achieves performance comparable to standard Gaussian selection and performs much better under aggressive pruning conditions. Together with phonetic tied-mixture modeling, the acoustic matching cost is reduced to almost 14% with little loss of accuracy.
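The selection and back-off scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`log_gauss`, `log_gmm`, `select_and_score`), the dictionary layout, and the toy one-dimensional models are all assumptions made for illustration. It also simplifies the setup to one state per phone and omits Gaussian pruning and tied mixtures.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def log_gmm(x, mixture):
    """Log-likelihood of x under a mixture of (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in mixture]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def select_and_score(x, monophone_states, triphone_states, k):
    """Score all triphone states for one frame x (hypothetical layout).

    monophone_states: {phone: mixture}
    triphone_states:  {state_name: (base_phone, mixture)}
    """
    # Preliminary evaluation with the context-independent models.
    mono = {p: log_gmm(x, mix) for p, mix in monophone_states.items()}
    # Keep the k best-scoring phones.
    selected = set(sorted(mono, key=mono.get, reverse=True)[:k])
    scores = {}
    for name, (phone, mix) in triphone_states.items():
        if phone in selected:
            # Selected phone: evaluate the higher-resolution triphone model.
            scores[name] = log_gmm(x, mix)
        else:
            # Unselected phone: back off to the monophone likelihood.
            scores[name] = mono[phone]
    return scores, selected

# Toy one-dimensional models, invented for this example.
mono_models = {"a": [(1.0, [0.0], [1.0])], "i": [(1.0, [5.0], [1.0])]}
tri_models = {
    "k-a+t": ("a", [(1.0, [0.2], [0.5])]),
    "k-i+t": ("i", [(1.0, [5.1], [0.5])]),
}
frame_scores, chosen = select_and_score([0.0], mono_models, tri_models, k=1)
```

With `k=1` only the triphone state whose base phone scores best in the preliminary pass is evaluated in full. The other state reuses the monophone score that was already computed, so the back-off itself costs no further Gaussian evaluations.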

Kiyohiro Shikano | Tatsuya Kawahara | Akinobu Lee

[1] Tatsuya Kawahara, et al. An efficient two-pass search algorithm using word trellis index, 1998, ICSLP.

[2] Enrico Bocchieri, et al. Vector quantization for the efficient computation of continuous density likelihoods, 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[3] Mark J. F. Gales, et al. Use of Gaussian selection in large vocabulary continuous speech recognition using HMMs, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[4] Kiyohiro Shikano, et al. A new phonetic tied-mixture model for efficient decoding, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[5] Nobuaki Minematsu, et al. Free software toolkit for Japanese large vocabulary continuous speech recognition, 2000, INTERSPEECH.

[6] Mark J. F. Gales, et al. State-based Gaussian selection in large vocabulary continuous speech recognition using HMMs, 1999, IEEE Trans. Speech Audio Process.