Fast speaker adaptation: some experiments on different techniques for codebook and HMM parameters estimation

A set of techniques for fast speaker adaptation in a large-vocabulary, natural-language speech recognition system is presented. The experiments were carried out using a 20000-word, real-time, natural-language speech recognizer for the Italian language. To perform speaker adaptation within the framework of the probabilistic approach to speech recognition, two different problems must be addressed: codebook adaptation and hidden Markov model (HMM) parameter adaptation. The basic idea is to use data collected from several different speakers as a source of a priori knowledge, combined with a small speech sample provided by the new speaker, to perform the adaptation task. Several different techniques for codebook adaptation have been tried and are discussed.
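The abstract does not specify the adaptation techniques themselves, so the following is only a minimal sketch of the general idea of combining multi-speaker a priori knowledge with a small sample from the new speaker. It assumes a vector-quantization codebook stored as an array of centroids and uses a simple count-weighted interpolation between each prior centroid and the new speaker's frames assigned to it. The function name `adapt_codebook` and the parameter `prior_weight` are hypothetical and are not taken from the paper.

```python
import numpy as np

def adapt_codebook(prior_codebook, new_frames, prior_weight=5.0):
    """Shift each prior codeword toward the new speaker's frames assigned to it.

    Illustrative assumption, not the paper's method: each adapted codeword is a
    weighted mean of the prior centroid (counted `prior_weight` times) and the
    new-speaker frames whose nearest prior codeword it is.
    """
    prior_codebook = np.asarray(prior_codebook, dtype=float)
    new_frames = np.asarray(new_frames, dtype=float)

    # Squared Euclidean distance from every frame to every prior codeword.
    diffs = new_frames[:, None, :] - prior_codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)

    adapted = prior_codebook.copy()
    for k in range(len(prior_codebook)):
        assigned = new_frames[labels == k]
        n = len(assigned)
        if n == 0:
            # No adaptation data for this codeword: keep the prior centroid.
            continue
        adapted[k] = (prior_weight * prior_codebook[k] + assigned.sum(axis=0)) / (prior_weight + n)
    return adapted

# Hypothetical toy data: two prior codewords, two frames from the new speaker.
prior = [[0.0, 0.0], [10.0, 10.0]]
frames = [[2.0, 2.0], [2.0, 2.0]]
adapted = adapt_codebook(prior, frames, prior_weight=2.0)
```

With little adaptation data the prior dominates, and codewords that receive no frames are left unchanged, which is one plausible way to keep a small speech sample from degrading the multi-speaker codebook.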
