Predictor codebooks for speaker-independent speech recognition

The authors examine the speech recognition capabilities of predictor codebooks under multi-speaker and speaker-independent conditions. Three spectrum-predictor structures are examined: a forward predictor, a backward predictor, and an interpolator. Predictor codebooks are generated by the LBG algorithm with a small modification for predictor quantization. The predictor codebooks are then tested on a phone recognition task with three different measurements. The degradation in predictor-codebook performance was reduced by one-third under speaker-independent conditions. Finally, continuous-speech recognition experiments are carried out using the predictor codebook for multi-speaker and speaker-independent conditions. The results show that the backward-predictor codebook is very effective.
