Paper information - Prototype-based MCE/GPD training for word spotting and connected word recognition

A straightforward application of PBMEC (prototype-based minimum error classifier) training to existing techniques for handling continuous speech is described. A novel MCE/GPD (minimum classification error/generalized probabilistic descent) loss function is defined that can incorporate word spotting errors and other measures of symbolic distance between correct and incorrect categories. Classification consists of a time-synchronous DTW (dynamic time warping) pass through a finite state machine. Adaptation uses an A*-based N-best algorithm and propagates the derivative of the loss over the N best paths through the finite state machine. The key feature is that the loss function being optimized closely reflects the actual recognition performance of the system.
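As a rough illustration of the kind of smoothed loss that MCE/GPD training optimizes, the sketch below computes a generic MCE loss for a single token. It is not the novel loss defined in the paper: it omits word spotting errors and symbolic distance terms entirely. The function name `mce_loss`, the parameters `eta` and `alpha`, and the log-mean-exp form of the competitor term are assumptions chosen for illustration, following a commonly used formulation from the general MCE literature.

```python
import math


def mce_loss(correct_score, competitor_scores, eta=2.0, alpha=1.0):
    """Generic smoothed MCE loss for one token (illustrative sketch only).

    Scores are discriminant values where higher means a better match,
    e.g. negated DTW path distances. All names here are hypothetical.
    """
    # Soft maximum over the competing categories: a log-mean-exp scaled
    # by eta. As eta grows, this approaches the best competitor score.
    m = max(competitor_scores)
    mean_exp = sum(math.exp(eta * (s - m)) for s in competitor_scores)
    mean_exp /= len(competitor_scores)
    soft_max = m + math.log(mean_exp) / eta

    # Misclassification measure: positive when competitors outscore
    # the correct category, negative when the correct category wins.
    d = -correct_score + soft_max

    # A sigmoid turns d into a differentiable approximation of the
    # 0/1 error count, which gradient descent can then minimize.
    return 1.0 / (1.0 + math.exp(-alpha * d))
```

With this form, a token whose correct category clearly wins yields a loss below 0.5, a clearly misrecognized token yields a loss above 0.5, and a tie yields exactly 0.5. In an N-best setting such as the one the abstract describes, `competitor_scores` would plausibly hold the scores of the incorrect paths among the N best.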

Shigeru Katagiri | Erik McDermott

[1] Shigeru Katagiri, et al. Prototype-based discriminative training for various speech units, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[2] Shigeru Katagiri, et al. A generalized probabilistic descent method, 1990.

[3] Alex Waibel, et al. Integrating time alignment and neural networks for high performance continuous speech recognition, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[4] Chin-Hui Lee, et al. Segmental GPD training of HMM based speech recognizer, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[5] Biing-Hwang Juang, et al. New discriminative training algorithms based on the generalized probabilistic descent method, 1991, Neural Networks for Signal Processing Proceedings of the 1991 IEEE Workshop.

[6] S. Amari. A Theory of Adaptive Pattern Classifiers, 1967.

[7] Shigeru Katagiri, et al. Application of a generalized probabilistic descent method to dynamic time warping-based speech recognition, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[8] Biing-Hwang Juang, et al. Discriminative learning for minimum error classification [pattern recognition], 1992, IEEE Trans. Signal Process.

[9] Steve Young, et al. The use of syntax and multiple alternatives in the VODIS voice operated database inquiry system, 1991.

[10] Shigeki Sagayama, et al. Appropriate error criterion selection for continuous speech HMM minimum error training, 1992, ICSLP.

[11] Frank K. Soong, et al. A Tree-Trellis Based Fast Search for Finding the N Best Sentence Hypotheses in Continuous Speech Recognition, 1990, HLT.

[12] Biing-Hwang Juang, et al. Discriminative template training for dynamic programming speech recognition, 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.