A study on speaker adaptation of large vocabulary

This paper proposes a speaker adaptation algorithm. Although a speaker-independent system performs well overall, its recognition results can vary with the characteristics of the individual speaker. A MAP (maximum a posteriori) formulation is developed that adapts the system to a speaker's characteristics by estimating the HMM (hidden Markov model) parameters from the training data. The proposed adaptation algorithm is evaluated on a large-vocabulary continuous speech recognition task, where the authors compare the recognition accuracy of the adapted acoustic models. In the experiments, the MAP algorithm achieves up to about a 40% additional reduction in phoneme recognition error.
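The abstract does not give the update equations, so the following is only a minimal sketch of how a MAP estimate of a single Gaussian mean in an HMM state is commonly formed: the speaker-independent mean acts as a prior and is interpolated with the adaptation data, weighted by how much data the state has seen. The function name `map_adapt_mean`, the prior weight `tau`, and the use of per-frame occupation probabilities are assumptions for illustration, not details taken from the paper.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Return a MAP-adapted mean vector for one Gaussian (illustrative only).

    prior_mean : speaker-independent mean, used as the prior (list of floats)
    frames     : adaptation feature vectors from the new speaker
    posteriors : occupation probability of this Gaussian for each frame
    tau        : hypothetical prior weight; larger values trust the prior more
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)  # soft count of frames assigned to this Gaussian

    # Posterior-weighted sum of the adaptation frames.
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the adaptation data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With no adaptation data the estimate stays at the speaker-independent mean, and as the occupancy grows it moves toward the speaker's own sample mean, which is why this kind of update suits small amounts of adaptation speech.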
