Prototype-based discriminative training for various speech units

It has since been shown that learning vector quantization (LVQ) is a special case of a more general method, generalized probabilistic descent (GPD), for gradient descent on a rigorously defined classification loss measure that closely reflects the misclassification rate. The authors extend LVQ into a prototype-based classifier appropriate for the classification of various long speech units. For word recognition, a dynamic time warping procedure is integrated into the GPD learning procedure. The resulting minimum error classifier (MEC) is no longer a purely LVQ-like method and is called the prototype-based minimum error classifier (PBMEC). Results are presented for the difficult Bell Labs E-set task and for speaker-dependent isolated word recognition with a vocabulary of 5240 words. They reveal clear gains in performance as a result of using PBMEC.
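To make the idea of gradient descent on a smoothed classification loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes fixed-length feature vectors, squared Euclidean distance, a misclassification measure built from the nearest correct and nearest rival prototypes only, and a sigmoid loss; the dynamic time warping stage is omitted. The names `gpd_step`, `lr`, and `alpha` are hypothetical.

```python
import math


def sq_dist(x, p):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, p))


def nearest(x, protos):
    """Return (index, squared distance) of the prototype closest to x."""
    return min(((i, sq_dist(x, p)) for i, p in enumerate(protos)),
               key=lambda t: t[1])


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gpd_step(x, label, prototypes, lr=0.1, alpha=1.0):
    """One gradient step on a sigmoid loss (assumed form, see lead-in).

    prototypes maps each class to a list of vectors and is updated in place.
    Returns the loss value computed before the update.
    """
    i_c, d_c = nearest(x, prototypes[label])
    # Best competing class: the one whose nearest prototype is closest to x.
    rival, (i_r, d_r) = min(
        ((k, nearest(x, ps)) for k, ps in prototypes.items() if k != label),
        key=lambda t: t[1][1],
    )
    mis = d_c - d_r                      # positive when x is misclassified
    loss = sigmoid(alpha * mis)          # smooth stand-in for the 0/1 error
    slope = alpha * loss * (1.0 - loss)  # derivative of the loss w.r.t. mis
    p_c = prototypes[label][i_c]
    p_r = prototypes[rival][i_r]
    # Descent pulls the correct prototype toward x and pushes the rival away.
    prototypes[label][i_c] = [p + 2 * lr * slope * (a - p) for a, p in zip(x, p_c)]
    prototypes[rival][i_r] = [p - 2 * lr * slope * (a - p) for a, p in zip(x, p_r)]
    return loss


if __name__ == "__main__":
    protos = {"a": [[0.0, 0.0]], "b": [[1.0, 1.0]]}
    for _ in range(5):
        print(round(gpd_step([0.6, 0.6], "a", protos), 4))
```

Under these assumptions the update resembles an LVQ-style rule: the nearest correct prototype is attracted and the nearest rival is repelled, with the step scaled by the slope of the sigmoid so that tokens near the decision boundary have the largest effect.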