Segmental optical phonetics for human and machine speech processing

It is now well known that talkers produce optical as well as acoustic speech signals, and that perceivers process both types of signals. Although perceptual effects of audiovisual speech integration have been a focus of research involving the visual speech stimulus, relatively little is known about visual-only speech perception and optical phonetic signals. Such knowledge is needed to exploit optical signals for applications such as synthetic talking heads and audiovisual automatic speech recognition (ASR). One important practical concern is the wide variation in performance among individual visual perceivers and talkers. This paper focuses on variation in visual phonetic perception, phoneme distinctiveness, and word recognition. It also introduces a project linking optical phonetics, speech kinematics, and perception.