Experiments for isolated-word recognition with single- and two-layer perceptrons

Abstract

Several design strategies for feed-forward networks are examined within the scope of pattern classification. Single- and two-layer perceptron models are adapted for experiments in isolated-word recognition. Direct (one-step) classification as well as several hierarchical (two-step) schemes are considered. For a vocabulary of 20 English words spoken repeatedly by 11 speakers, the word classes are found to be separable by hyperplanes in the chosen feature space. Since the underlying database contains only a small training set for speaker-dependent word recognition, an automatic expansion of the training material improves the generalization properties of the networks. This method accounts for a wide variety of observable temporal structures for each word and gives a better overall estimate of the network parameters, leading to a recognition rate of 99.5%. For speaker-independent word recognition, a hierarchical structure with pairwise training of two-class models is superior to a single uniform network (98% average recognition rate).
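The abstract's central finding, that the 20 word classes are separable by hyperplanes, amounts to one weight vector (hyperplane) per word in a single-layer perceptron. The sketch below illustrates that idea only; it is not the authors' implementation. The feature dimension (128), learning rate, epoch count, and the random toy data are all illustrative assumptions, since the paper's actual feature coding is not reproduced here.

```python
# Minimal sketch of direct (one-step) classification with a
# single-layer perceptron: one hyperplane per word class.
import numpy as np

rng = np.random.default_rng(0)

N_CLASSES = 20     # vocabulary size from the paper
N_FEATURES = 128   # assumed length of the word feature vector

def train_perceptron(X, y, epochs=50, lr=0.1):
    """Train one weight vector per class on fixed-length feature vectors."""
    W = np.zeros((N_CLASSES, N_FEATURES + 1))      # +1 column for the bias
    Xb = np.hstack([X, np.ones((len(X), 1))])      # append constant bias input
    T = np.eye(N_CLASSES)[y]                       # one-hot target matrix
    for _ in range(epochs):
        out = 1.0 / (1.0 + np.exp(-Xb @ W.T))      # sigmoid class outputs
        W += lr * (T - out).T @ Xb / len(X)        # cross-entropy gradient step
    return W

def classify(W, x):
    """Assign the class whose hyperplane gives the highest score."""
    return int(np.argmax(W @ np.append(x, 1.0)))

# Toy usage with random data standing in for real word feature vectors.
X = rng.normal(size=(200, N_FEATURES))
y = rng.integers(0, N_CLASSES, size=200)
W = train_perceptron(X, y)
print(classify(W, X[0]))
```

The hierarchical scheme for speaker-independent recognition would, by the abstract's description, replace this single uniform network with one two-class unit per word pair (C(20, 2) = 190 pairs for the 20-word vocabulary), combining the pairwise decisions into a final label, for example by voting; the exact combination rule is not specified in the abstract.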
