Word recognition using neural nets, multi-state Gaussian and k-nearest neighbor classifiers

The author compares the performance of artificial neural networks, the multistate Gaussian classifier, and the k-nearest neighbor classifier for speaker-independent isolated-word recognition over the telephone network. The classifiers are trained and tested on data collected from real customers calling one directory assistance office. Each approach is evaluated on the 25 most frequently requested city names in terms of first-choice accuracy and the robustness of the decision (the separation between the first and second word candidates). The artificial neural network achieved 92.9% first-choice accuracy, the multistate Gaussian classifier 86.6%, and the k-nearest neighbor classifier 91.7%. The discrimination ratio between the first and second word candidates was significantly higher for the artificial neural network: 3.5, versus 1.4 and 1.2 for the Gaussian and k-nearest neighbor classifiers, respectively.
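The two evaluation measures named above can be illustrated with a minimal sketch. The abstract does not define the discrimination ratio, so the code assumes it is the best candidate's score divided by the runner-up's score, averaged over utterances; the function names, the score layout (one row of positive per-word scores per utterance), and the toy data are all hypothetical and are not taken from the paper.

```python
def first_choice_accuracy(scores, labels):
    """Fraction of utterances whose top-scoring word is the correct word."""
    hits = 0
    for row, label in zip(scores, labels):
        # Index of the highest score in this utterance's row.
        best_index = max(range(len(row)), key=row.__getitem__)
        if best_index == label:
            hits += 1
    return hits / len(labels)


def mean_discrimination_ratio(scores):
    """Mean of (best score / second-best score) over all utterances.

    ASSUMPTION: this definition is illustrative only; the paper's exact
    definition is not given in the abstract.
    """
    ratios = []
    for row in scores:
        second, best = sorted(row)[-2:]  # two largest scores, ascending
        ratios.append(best / second)
    return sum(ratios) / len(ratios)


# Hypothetical toy data: 4 utterances, 3 candidate words, positive scores.
toy_scores = [
    [8.0, 2.0, 1.0],
    [3.0, 6.0, 1.0],
    [4.0, 2.0, 2.0],
    [1.0, 4.0, 2.0],
]
toy_labels = [0, 1, 2, 1]  # index of the correct word for each utterance

accuracy = first_choice_accuracy(toy_scores, toy_labels)
ratio = mean_discrimination_ratio(toy_scores)
```

Under this assumed definition, a classifier with a larger ratio separates its first choice from the runner-up more clearly, which is the sense in which the abstract describes a more robust decision.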
