Influence of outliers in training the parametric trajectory models for speech recognition

In this study, we developed a modified maximum likelihood (ML) algorithm for efficient computation in implementing minimum classification error (MCE)-like training to optimally estimate the state-dependent polynomial coefficients in the trended HMM. We devised a new discriminative training method that controls the influence of outliers in the training data on the constructed models. The resulting models appear to provide correct recognition of confusable patterns. For alphabet recognition tasks, outlier emphasis improved performance: compared with models trained by traditional ML, the error rate was reduced by 14% for the linear-trend models and by 7.5% for the constant-trend models.
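As a rough illustration of the quantities named above, the following Python sketch (assuming NumPy) fits state-dependent polynomial trend coefficients to one segment by least squares, scores the segment under a Gaussian residual model, and evaluates a standard sigmoid MCE loss on class scores. The function names, the fixed diagonal variance, and the smoothing constants `eta` and `gamma` are assumptions made for illustration. The outlier-control mechanism of the method itself is not reproduced here.

```python
import numpy as np


def fit_trend(segment, order):
    """Least-squares fit of polynomial trend coefficients for one state segment.

    segment: array of shape (T, D), the frames assigned to the state.
    order:   0 for a constant trend, 1 for a linear trend, and so on.
    Returns coefficients of shape (order + 1, D).
    """
    segment = np.asarray(segment, dtype=float)
    t = np.arange(segment.shape[0], dtype=float)        # time within the state
    design = np.vander(t, order + 1, increasing=True)   # columns 1, t, t^2, ...
    coeffs, *_ = np.linalg.lstsq(design, segment, rcond=None)
    return coeffs


def trend_log_likelihood(segment, coeffs, var):
    """Log-likelihood of a segment with Gaussian residuals around the trend.

    var is a per-dimension variance of shape (D,), assumed fixed here.
    """
    segment = np.asarray(segment, dtype=float)
    var = np.asarray(var, dtype=float)
    t = np.arange(segment.shape[0], dtype=float)
    design = np.vander(t, coeffs.shape[0], increasing=True)
    resid = segment - design @ coeffs
    return float(-0.5 * np.sum(resid ** 2 / var + np.log(2.0 * np.pi * var)))


def mce_loss(scores, correct, eta=1.0, gamma=1.0):
    """Sigmoid MCE loss from per-class discriminant scores (e.g. log-likelihoods).

    The misclassification measure is d = -g_correct + soft-max of rival scores;
    the loss approaches 0 for clear correct decisions and 1 for clear errors.
    """
    scores = np.asarray(scores, dtype=float)
    rivals = eta * np.delete(scores, correct)
    peak = rivals.max()                                  # stabilise the exponentials
    anti = (peak + np.log(np.mean(np.exp(rivals - peak)))) / eta
    d = -scores[correct] + anti
    return float(1.0 / (1.0 + np.exp(-gamma * d)))
```

On a segment whose features drift linearly over time, the linear-trend fit gives a higher log-likelihood than the constant-trend fit, which is the modeling gap that distinguishes the two trend orders compared in the abstract.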
