Analysis, synthesis, and perception of visible articulatory movements

Abstract: Significant advances in isolating the perceptually important properties of speech sounds followed the development of techniques for acoustical speech synthesis in the early 1950s. Prescribing the spectro-temporal structure of the acoustic signal made it possible to manipulate individual acoustical parameters and thus to study how speech sounds are identified and discriminated. This paper reviews attempts to develop analogous facilities for studying the perception of visible facial articulatory movements in lipreading. A new approach, involving interrelated procedures for measuring, modelling, and animating displays of a talking face with computer graphics, is described, along with a prototypic perceptual experiment. The results suggest that the important visible properties of point vowels may not be fully captured by descriptions of vertical jaw movement, horizontal and vertical oral opening, and lip shape, even when the vowels are spoken carefully. Additional cues, probably involving the visibility of the teeth and tongue tip, appear to be required for accurate identification.
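
The parametric approach the abstract describes can be illustrated with a minimal sketch. The following Python example is purely hypothetical: the parameter names (`jaw_opening`, `lip_width`, `lip_height`) and the numeric vowel targets are illustrative assumptions, not measurements or methods from the study. It shows how a small set of visible articulatory parameters, corresponding to the descriptors discussed above, might be keyframed and interpolated to drive a graphical talking-face display.

```python
from dataclasses import dataclass

@dataclass
class ArticulatoryFrame:
    """One frame of visible articulatory parameters (arbitrary units).

    The three fields mirror the descriptors named in the abstract:
    vertical jaw movement, horizontal oral opening, and vertical oral
    opening / lip shape. Values are illustrative, not measured data.
    """
    jaw_opening: float   # vertical jaw displacement
    lip_width: float     # horizontal oral opening
    lip_height: float    # vertical oral opening

# Hypothetical targets for the three point vowels.
VOWEL_TARGETS = {
    "i": ArticulatoryFrame(jaw_opening=0.2, lip_width=0.9, lip_height=0.2),
    "a": ArticulatoryFrame(jaw_opening=0.9, lip_width=0.5, lip_height=0.8),
    "u": ArticulatoryFrame(jaw_opening=0.3, lip_width=0.2, lip_height=0.4),
}

def interpolate(a: ArticulatoryFrame, b: ArticulatoryFrame,
                t: float) -> ArticulatoryFrame:
    """Linearly interpolate between two keyframes, t in [0, 1]."""
    lerp = lambda x, y: x + (y - x) * t
    return ArticulatoryFrame(
        jaw_opening=lerp(a.jaw_opening, b.jaw_opening),
        lip_width=lerp(a.lip_width, b.lip_width),
        lip_height=lerp(a.lip_height, b.lip_height),
    )

def animate(sequence: list[str],
            frames_per_transition: int = 10) -> list[ArticulatoryFrame]:
    """Generate a frame-by-frame trajectory through a vowel sequence."""
    frames: list[ArticulatoryFrame] = []
    for start, end in zip(sequence, sequence[1:]):
        for step in range(frames_per_transition):
            t = step / frames_per_transition
            frames.append(interpolate(VOWEL_TARGETS[start],
                                      VOWEL_TARGETS[end], t))
    frames.append(VOWEL_TARGETS[sequence[-1]])
    return frames

if __name__ == "__main__":
    # Sweep through the three point vowels and print each frame.
    for frame in animate(["i", "a", "u"], frames_per_transition=4):
        print(frame)
```

Notably, a display driven only by these three parameters would render nothing of the teeth or tongue tip, which is one way to read the abstract's finding that such descriptions underdetermine point-vowel identification.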