Paper information - Predictor codebooks for speaker-independent speech recognition

The authors examine the speech recognition capabilities of predictor codebooks under multi-speaker and speaker-independent conditions. Three structures of spectrum predictors, a forward predictor, a backward predictor, and an interpolator, are examined. Predictor codebooks are generated by the LBG algorithm with a small modification for predictor quantization. The predictor codebooks are then tested on a phone recognition task with three different measurements. The degradation in predictor-codebook performance was reduced by one-third under speaker-independent conditions. Finally, continuous-speech recognition experiments are carried out using the predictor codebook for multi-speaker and speaker-independent conditions. The results show that the backward-predictor codebook is very effective.

Takeshi Kawabata
