Deleted interpolation and density sharing for continuous hidden Markov models

As one of the most powerful smoothing techniques, deleted interpolation has been widely used in both discrete and semi-continuous hidden Markov model (HMM) based speech recognition systems. For continuous HMMs, most smoothing techniques are carried out on the parameters themselves, such as Gaussian mean or covariance parameters. In this paper, we propose to smooth the probability density values instead of the parameters of continuous HMMs. This allows us to use most of the existing smoothing techniques for both discrete and continuous HMMs. We also point out that our deleted interpolation can be regarded as a parameter sharing technique. We further generalize this sharing to the probability density function (PDF) level, in which each PDF becomes a basic unit and can be freely shared across any Markov state. Across a wide range of dictation experiments, deleted interpolation reduced the word error rate by 11% to 23% compared with other simple parameter smoothing techniques such as flooring. Generic PDF sharing further reduced the error rate by 3%.
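To make the idea of smoothing density values rather than parameters concrete, the following is a minimal sketch and not the authors' implementation. It assumes one-dimensional Gaussian output densities, a "detailed" density and a more robust "general" density for the same state, and a single interpolation weight `lam` estimated by EM from density values evaluated on held-out (deleted) data. All function names and the example numbers are hypothetical; a full deleted interpolation scheme would also rotate which block of training data is held out.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def estimate_lambda(detailed, general, iterations=50, lam=0.5):
    """EM estimate of the interpolation weight (hypothetical helper).

    `detailed` and `general` are density values of the two models,
    evaluated on the same held-out frames.
    """
    for _ in range(iterations):
        total = 0.0
        for p_d, p_g in zip(detailed, general):
            mix = lam * p_d + (1.0 - lam) * p_g
            if mix > 0.0:
                # Posterior probability that the detailed model produced the frame.
                total += lam * p_d / mix
        lam = total / len(detailed)
    return lam


def smoothed_density(x, lam, detailed_pdf, general_pdf):
    """Interpolate density values; the Gaussian parameters are left untouched."""
    return lam * detailed_pdf(x) + (1.0 - lam) * general_pdf(x)


# Assumed example: a sharp detailed density and a broader general density.
detailed_pdf = lambda x: gaussian_pdf(x, 0.0, 1.0)
general_pdf = lambda x: gaussian_pdf(x, 0.0, 4.0)

held_out = [-2.5, -0.3, 0.1, 1.8, 3.0]  # hypothetical held-out observations
lam = estimate_lambda([detailed_pdf(x) for x in held_out],
                      [general_pdf(x) for x in held_out])
value = smoothed_density(0.5, lam, detailed_pdf, general_pdf)
```

Because the interpolation acts on density values, the same weight estimation applies regardless of how each component density is parameterized, which is what allows smoothing techniques from discrete HMMs to be reused.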
