Speaker-independent recognition of spoken English letters

A description is presented of EAR, an English alphabet recognizer that performs speaker-independent recognition of letters spoken in isolation. During recognition, (a) signal processing routines transform the digitized speech into useful representations, (b) rules are applied to these representations to locate segment boundaries, (c) feature measurements are computed on the speech segments, and (d) a neural network uses the feature measurements to classify the letter. The system was trained on one token of each letter from 120 speakers. Recognition accuracy was 95% when tested on a new set of 30 speakers, and 96% when tested on a second token of each letter from the original 120 speakers. This accuracy is 6% higher than that of previously reported systems. The high level of performance is attributed to accurate and explicit phonetic segmentation, the use of speech knowledge to select features that measure the important linguistic information, and the ability of the neural classifier to model the variability of the data.
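To make the four-stage pipeline concrete, the sketch below lays the stages out as functions. The paper provides no code, so every routine, threshold, and the toy network here are hypothetical stand-ins under simplifying assumptions (a single energy contour as the representation, an energy-threshold segmentation rule, a fixed-weight network), not the authors' implementation.

```python
"""Illustrative sketch of the (a)-(d) recognition pipeline; all names are assumed."""
import numpy as np


def frame_energies(waveform: np.ndarray, frame_len: int = 160) -> np.ndarray:
    """(a) Signal processing: reduce digitized speech to a per-frame log-energy
    contour (a stand-in for the paper's richer representations)."""
    n_frames = len(waveform) // frame_len
    frames = waveform[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.log1p(np.sum(frames.astype(float) ** 2, axis=1))


def segment_boundaries(energy: np.ndarray) -> tuple:
    """(b) Rule-based segmentation: mark where the letter token begins and ends
    by thresholding the energy contour (an assumed, simplified rule)."""
    threshold = energy.min() + 0.2 * (energy.max() - energy.min())
    voiced = np.where(energy > threshold)[0]
    return (int(voiced[0]), int(voiced[-1])) if len(voiced) else (0, len(energy) - 1)


def measure_features(energy: np.ndarray, bounds: tuple) -> np.ndarray:
    """(c) Feature measurement on the located segment: duration and coarse
    energy statistics (placeholders for the paper's phonetic features)."""
    start, end = bounds
    seg = energy[start : end + 1]
    return np.array([float(end - start), seg.mean(), seg.max(), seg.std()])


class TinyNetClassifier:
    """(d) A one-hidden-layer network with fixed random weights, standing in
    for the trained neural classifier; it only illustrates the interface."""

    def __init__(self, n_features: int = 4, n_hidden: int = 8, n_classes: int = 26):
        rng = np.random.default_rng(0)
        self.w1 = rng.normal(size=(n_features, n_hidden))
        self.w2 = rng.normal(size=(n_hidden, n_classes))

    def predict(self, features: np.ndarray) -> str:
        hidden = np.tanh(features @ self.w1)
        scores = hidden @ self.w2
        return chr(ord("A") + int(np.argmax(scores)))


def recognize_letter(waveform: np.ndarray, classifier: TinyNetClassifier) -> str:
    """Run one isolated-letter utterance through stages (a)-(d)."""
    energy = frame_energies(waveform)
    bounds = segment_boundaries(energy)
    features = measure_features(energy, bounds)
    return classifier.predict(features)
```

In this sketch, only the control flow mirrors the abstract: a real system would substitute the actual spectral representations, phonetic segmentation rules, linguistically motivated features, and a classifier trained on the 120-speaker set.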