A minimum-distortion segmentation/LVQ hybrid algorithm for speech recognition

Learning vector quantization (LVQ) is quick and simple to train, yet it can achieve classification power at least as high as that of powerful but complex classifiers based on artificial neural networks. However, LVQ is a discriminative training algorithm for a distance classifier that handles static (fixed-dimensional) patterns, so an additional mechanism is required to apply it to dynamic (variable-duration) speech patterns. To meet this requirement, an HMM/LVQ hybrid algorithm was proposed that integrates HMM (Viterbi) segmentation with LVQ classification. Because that algorithm uses all the possible HMMs for segmentation, it produces an enormous number of training tokens, which makes it difficult to apply to large-scale continuous speech recognition tasks. We therefore present a new minimum-distortion segmentation (MDS)/discriminative classification hybrid algorithm. The MDS algorithm produces a single segmentation, which is used in place of the many HMM segmentations. To allow a proper comparison between the two methods, we use the same LVQ formulation as our discriminative classifier. For clarity, we refer to the proposed algorithm as the MDS/LVQ hybrid algorithm. Results on the E-set task show that MDS/LVQ, with its significantly reduced training, can achieve discriminative power at least as high as that of HMM/LVQ.
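
As an illustration of the kind of distance classifier the abstract refers to, the following is a minimal sketch of the basic LVQ1 update rule on fixed-dimensional vectors. It is not the LVQ formulation used in this work, which the abstract does not specify; the function names (`nearest`, `lvq1_step`, `classify`), the squared Euclidean distance, and the fixed learning rate `lr` are all assumptions made for the example. Segmentation is not shown: the sketch assumes each token has already been reduced to a fixed-length vector.

```python
def nearest(codebook, x):
    """Return the index of the reference vector closest to x.

    codebook is a list of (vector, class_label) pairs; distance is
    squared Euclidean (an assumption for this sketch).
    """
    def dist2(v):
        return sum((a - b) ** 2 for a, b in zip(v, x))
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i][0]))


def lvq1_step(codebook, x, label, lr=0.1):
    """Apply one LVQ1 update in place and return the winner's index.

    The nearest reference vector moves toward x when its class matches
    the token's label, and away from x otherwise.
    """
    i = nearest(codebook, x)
    vec, cls = codebook[i]
    sign = 1.0 if cls == label else -1.0
    codebook[i] = ([v + sign * lr * (a - v) for v, a in zip(vec, x)], cls)
    return i


def classify(codebook, x):
    """Label x with the class of its nearest reference vector."""
    return codebook[nearest(codebook, x)][1]


if __name__ == "__main__":
    # Two hypothetical reference vectors, one per class.
    codebook = [([0.0, 0.0], "a"), ([1.0, 1.0], "b")]
    lvq1_step(codebook, [0.2, 0.0], "a", lr=0.5)  # correct winner: pulled closer
    lvq1_step(codebook, [0.8, 1.0], "a", lr=0.5)  # wrong winner: pushed away
    print(codebook)
    print(classify(codebook, [0.3, 0.1]))
```

Because each training token updates only its nearest reference vector, the training cost grows with the number of tokens, which is why reducing the many HMM segmentations to a single segmentation per utterance reduces training.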
