A Bayesian approach to speaker adaptation for the stochastic segment model

Speaker adaptation is frequently used to achieve good speech recognition performance without the high cost of training a speaker-dependent model. This study investigates speaker adaptation for recognizers that use multivariate Gaussian densities, specifically the stochastic segment model. A Bayesian approach is followed: the parameters of a speaker-adapted model are estimated using prior densities obtained from speaker-independent data. In experiments, mean adaptation with roughly 3 min of speech yields a 16% error reduction, nearly half the difference between speaker-independent and speaker-dependent recognition rates.
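To make the idea of Bayesian mean adaptation concrete, the following is a minimal sketch of a MAP (maximum a posteriori) update of a Gaussian mean under a Gaussian prior. It is an illustration under stated assumptions, not the estimator from the paper: it assumes diagonal covariances, a known observation variance, and a prior mean and prior variance taken from speaker-independent data. The function name `map_adapt_mean` and all parameter names are hypothetical.

```python
def map_adapt_mean(prior_mean, prior_var, obs_var, frames):
    """Per-dimension MAP estimate of a Gaussian mean under a Gaussian prior.

    Assumptions (not taken from the paper): diagonal covariances and a
    known observation variance.
    prior_mean, prior_var: speaker-independent prior on the mean.
    obs_var: variance of the observations around the mean.
    frames: list of feature vectors from the new speaker.
    """
    n = len(frames)
    dim = len(prior_mean)
    if n == 0:
        # No adaptation data: fall back to the speaker-independent mean.
        return list(prior_mean)

    # Sample mean of the adaptation data, one value per dimension.
    sample_mean = [sum(f[d] for f in frames) / n for d in range(dim)]

    adapted = []
    for d in range(dim):
        # Weight on the data grows with n and with the prior variance.
        w = n * prior_var[d] / (n * prior_var[d] + obs_var[d])
        adapted.append(w * sample_mean[d] + (1.0 - w) * prior_mean[d])
    return adapted


if __name__ == "__main__":
    prior_mean = [0.0, 0.0]
    prior_var = [1.0, 1.0]
    obs_var = [1.0, 1.0]
    print(map_adapt_mean(prior_mean, prior_var, obs_var, [[2.0, 4.0]]))
    print(map_adapt_mean(prior_mean, prior_var, obs_var, [[2.0, 4.0]] * 3))
```

With little adaptation data the estimate stays close to the speaker-independent prior mean; as more frames from the new speaker arrive, it moves toward that speaker's sample mean.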
