Speaker-independent spoken digits recognition using LVQ

This paper presents a speaker-independent recognition system for spoken Japanese digits based on learning vector quantization (LVQ). LVQ is very effective for phoneme recognition, and its algorithm is simple. The authors apply the LVQ algorithm with the whole word, rather than the phoneme, as the recognition unit. The input vectors are mel-cepstrum coefficients computed from the beginning point to the end point of each spoken digit. Recognition consists only of finding the reference vector closest to the input vector. Experiments cover two cases: isolated spoken digits and continuous spoken digits, with a speech rate V in the range 1 < V < 3 words/s. The recognition rate was 99.2% for isolated digits and 95.4% for continuous digits. These results indicate that the method is effective for spoken digit recognition.
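
The nearest-reference-vector recognition step and an LVQ-style training update can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the basic LVQ1 update rule (the abstract does not say which LVQ variant was used), assumes every utterance has already been mapped to a fixed-length feature vector (the abstract does not describe how word-length mel-cepstrum sequences are normalized), and uses toy 2-D points in place of real mel-cepstrum features. The names `train_lvq1`, `classify`, and the toy data are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, x):
    """Index of the reference vector in the codebook closest to x."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i][0], x))


def classify(codebook, x):
    """Recognition: return the label of the closest reference vector."""
    return codebook[nearest(codebook, x)][1]


def train_lvq1(samples, codebook, alpha=0.1, epochs=20, seed=0):
    """LVQ1 (assumed variant): pull the winning reference vector toward
    a sample of the same class, push it away otherwise."""
    rng = random.Random(seed)
    codebook = [(list(vec), label) for vec, label in codebook]  # copy
    order = list(samples)
    for epoch in range(epochs):
        rate = alpha * (1 - epoch / epochs)  # linearly decaying rate
        rng.shuffle(order)
        for x, label in order:
            vec, vec_label = codebook[nearest(codebook, x)]
            sign = 1.0 if vec_label == label else -1.0
            for d in range(len(vec)):
                vec[d] += sign * rate * (x[d] - vec[d])
    return codebook


# Toy stand-ins for fixed-length word feature vectors (hypothetical data).
samples = [([0.0, 0.1], "zero"), ([0.2, 0.0], "zero"),
           ([1.0, 0.9], "one"), ([0.9, 1.1], "one")]
initial = [([0.3, 0.3], "zero"), ([0.7, 0.7], "one")]

trained = train_lvq1(samples, initial)
print(classify(trained, [0.1, 0.1]), classify(trained, [1.0, 1.0]))
```

In this sketch one reference vector stands for a whole word, mirroring the word-as-unit idea in the abstract; a real system would likely keep several reference vectors per digit.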
