Robust speaker adaptation using a piecewise linear acoustic mapping

In a large-vocabulary speech recognition system, it is desirable to make use of previously acquired speech data when encountering new speakers. The authors describe an adaptation strategy based on a piecewise linear mapping between the feature space of a new speaker and that of a reference speaker. This speaker-normalizing mapping is used to transform the previously acquired parameters of the reference speaker onto the space of the new speaker. The result is a robust speaker adaptation procedure that drastically reduces the amount of training data required from the new speaker. The performance of this method is illustrated on an isolated-utterance speech recognition task with a vocabulary of 20,000 words.
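
The idea of a piecewise linear mapping between two feature spaces can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not specify. It assumes that paired feature vectors from the reference and new speakers are already available (for example, from some prior alignment step), that the regions are nearest-centroid cells around given centroids, and that one affine map per region is fitted by ordinary least squares. All function and variable names are hypothetical.

```python
import numpy as np


def assign_regions(x, centroids):
    """Return, for each row of x, the index of the nearest centroid."""
    d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def fit_piecewise_linear_map(ref, new, centroids):
    """Fit one affine map (reference -> new speaker) per region.

    ref, new  : (n, dim) arrays of paired feature vectors (assumed aligned).
    centroids : (K, dim) array defining the regions of the reference space.
    Returns a (K, dim + 1, dim) array; the last row of each map is the offset.
    """
    labels = assign_regions(ref, centroids)
    dim = ref.shape[1]
    maps = []
    for k in range(len(centroids)):
        idx = labels == k
        if idx.sum() < dim + 1:
            # Too few pairs to fit an affine map: fall back to the identity.
            maps.append(np.vstack([np.eye(dim), np.zeros((1, dim))]))
            continue
        # Append a column of ones so the offset is estimated jointly.
        X = np.hstack([ref[idx], np.ones((idx.sum(), 1))])
        W, *_ = np.linalg.lstsq(X, new[idx], rcond=None)
        maps.append(W)
    return np.stack(maps)


def apply_map(x, centroids, maps):
    """Map reference-speaker vectors into the new speaker's space."""
    labels = assign_regions(x, centroids)
    X = np.hstack([x, np.ones((len(x), 1))])
    # Each row uses the affine map of the region it falls in.
    return np.einsum("nd,nde->ne", X, maps[labels])
```

In this sketch, `apply_map` would be called on the reference speaker's stored parameters (for example, prototype vectors) so that only the small set of map coefficients has to be estimated from the new speaker's data.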
