A connectionist model for phoneme recognition in continuous speech
A connectionist structure for phoneme recognition in continuous speech is described. It has two main parts. The first is a sound subunit classifier: a three-layer back-propagation network that classifies speech subunits from frames of spectral speech data. The second is a sequence classifier: a network of neural-like units that classifies phonemes from input sequences of subunits according to their occurrence and duration. Results are given for a 15-phoneme subset of British English, for a single speaker. These include the difficult syllable-initial and syllable-final stop consonants, fricatives, vowels, and diphthongs. The overall recognition accuracy achieved is 87%.
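The two-stage structure described above can be illustrated with a rough sketch. This is not the authors' implementation: the layer sizes, the random untrained weights, the `min_frames` threshold, and the template table are all hypothetical, back-propagation training is omitted, and the second stage uses a simple rule-based matcher as a stand-in for the paper's network of neural-like units.

```python
import math
import random
from itertools import groupby


def make_layer(n_in, n_out, rng):
    # One weight row per output unit; the last entry of each row is the bias.
    # Weights are random here; the paper trains them with back propagation.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward_layer(weights, inputs):
    # Sigmoid units: weighted sum of the inputs plus bias.
    outputs = []
    for row in weights:
        z = row[-1] + sum(w * x for w, x in zip(row, inputs))
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def classify_frame(hidden_w, output_w, frame):
    # Stage 1: input layer -> hidden layer -> output layer.
    # Returns the index of the most active output unit (the subunit label).
    hidden = forward_layer(hidden_w, frame)
    output = forward_layer(output_w, hidden)
    return max(range(len(output)), key=output.__getitem__)


def run_lengths(labels):
    # Collapse a per-frame label sequence into (label, duration_in_frames) runs.
    return [(label, len(list(group))) for label, group in groupby(labels)]


def decode_phonemes(labels, templates, min_frames=2):
    # Stage 2 stand-in (assumption): drop subunits shorter than min_frames,
    # then match the remaining subunit sequence against hypothetical templates.
    kept = [label for label, n in run_lengths(labels) if n >= min_frames]
    phonemes = []
    i = 0
    while i < len(kept):
        for phoneme, pattern in templates.items():
            if kept[i:i + len(pattern)] == pattern:
                phonemes.append(phoneme)
                i += len(pattern)
                break
        else:
            i += 1  # no template starts here; skip this subunit
    return phonemes


if __name__ == "__main__":
    rng = random.Random(0)
    hidden_w = make_layer(16, 8, rng)   # 16 spectral channels (assumed)
    output_w = make_layer(8, 4, rng)    # 4 subunit classes (assumed)
    frames = [[rng.random() for _ in range(16)] for _ in range(10)]
    labels = [classify_frame(hidden_w, output_w, f) for f in frames]
    templates = {"p": [0, 1], "a": [2]}  # hypothetical subunit patterns
    print(labels, decode_phonemes(labels, templates))
```

The split mirrors the abstract: the first stage decides on each frame independently, and only the second stage looks at which subunits occur and for how long.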