A study on speaker adaptation of continuous density HMM parameters

For a speech recognition system based on a continuous density hidden Markov model (CDHMM), it is shown that speaker adaptation of the CDHMM parameters can be formulated as a Bayesian learning procedure and integrated into the segmental k-means training algorithm. Results are reported for adapting both the mean and the diagonal covariance matrix of the Gaussian state observation densities of a CDHMM. When tested on a 39-word English alpha-digit vocabulary in isolated-word mode, the adaptation procedure outperforms a speaker-independent system even when only one training token per word is used for adaptation. Much better performance is achieved when two or more training tokens are used.
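To make the idea of Bayesian adaptation of a Gaussian mean and diagonal covariance concrete, here is a minimal sketch. It is not taken from the paper: it assumes a textbook conjugate normal-gamma prior on each dimension's mean and precision, and it uses the joint posterior mode as the adapted estimate. The names `NormalGammaPrior`, `map_adapt_1d`, and `map_adapt_diag` are hypothetical, and the way the prior hyperparameters would be derived from a speaker-independent model is left open.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class NormalGammaPrior:
    """Assumed conjugate prior for one dimension of one Gaussian state density."""
    mu0: float    # prior mean, e.g. the speaker-independent mean
    tau: float    # prior strength on the mean (acts like a pseudo-count)
    alpha: float  # gamma shape for the precision (must exceed 0.5)
    beta: float   # gamma rate for the precision


def map_adapt_1d(prior: NormalGammaPrior, frames: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, variance) at the joint posterior mode for one dimension."""
    n = len(frames)
    if n == 0:
        # No adaptation data: fall back to the mode of the prior itself.
        return prior.mu0, prior.beta / (prior.alpha - 0.5)
    xbar = sum(frames) / n
    scatter = sum((x - xbar) ** 2 for x in frames)
    # Standard normal-gamma posterior updates.
    tau_n = prior.tau + n
    mu_n = (prior.tau * prior.mu0 + n * xbar) / tau_n
    alpha_n = prior.alpha + n / 2
    beta_n = (prior.beta + 0.5 * scatter
              + prior.tau * n * (xbar - prior.mu0) ** 2 / (2 * tau_n))
    # Joint mode: mean = mu_n, precision = (alpha_n - 1/2) / beta_n.
    return mu_n, beta_n / (alpha_n - 0.5)


def map_adapt_diag(priors: Sequence[NormalGammaPrior],
                   frames: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Adapt a diagonal-covariance Gaussian by treating each dimension independently."""
    means, variances = [], []
    for d, prior in enumerate(priors):
        m, v = map_adapt_1d(prior, [f[d] for f in frames])
        means.append(m)
        variances.append(v)
    return means, variances
```

In a segmental k-means setting, `frames` would be the observation vectors that the current segmentation assigns to a given state, so the update could be rerun after each re-segmentation. With few adaptation frames the estimate stays close to the prior, and it moves toward the speaker's own sample statistics as more frames are assigned.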
