Paper information - Continuous optical automatic speech recognition by lipreading

Continuous optical automatic speech recognition by lipreading

We describe a continuous optical automatic speech recognizer (OASR) that uses optical information from the oral-cavity shadow of a speaker. The system achieves 25.3 percent recognition on sentences having a perplexity of 150 without using any syntactic, semantic, acoustic, or contextual guides. We introduce 13, mostly dynamic, oral-cavity features used for optical recognition, present phones that appear optically similar (visemes) for our speaker, and present the recognition results for our hidden Markov models (HMMs) using visemes, trisemes, and generalized trisemes. We conclude that future research is warranted for optical recognition, especially when combined with other input modalities.