Paper information - Audio driven facial animation for audio-visual reality

In this paper, we demonstrate a morphing-based, automated, audio-driven facial animation system. Based on an incoming audio stream, a face image is animated with full lip synchronization and expression. Given the audio stream and still pictures of a face speaking different visemes, an animation sequence is constructed using optical flow between visemes. Rules based on coarticulation and viseme duration are formulated to control continuity in the shape and extent of lip opening. In addition, new viseme-expression combinations are synthesized so that animations with new facial expressions can be generated. Finally, various applications of this system are discussed in the context of creating audio-visual reality.
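
The abstract gives no implementation details, so the following is only a minimal sketch of how a viseme timeline might be turned into per-frame morph weights. It is not the authors' method. The phoneme-to-viseme table, the `full_duration` threshold, the linear blend, and all function names are assumptions made for illustration. The optical-flow warping itself is omitted, and the timeline is assumed to start at 0 seconds with no gaps.

```python
from dataclasses import dataclass

# Hypothetical phoneme-to-viseme table; the paper's actual viseme set is not given here.
PHONEME_TO_VISEME = {
    "p": "closed", "b": "closed", "m": "closed",
    "a": "open", "o": "round", "u": "round", "sil": "rest",
}


@dataclass
class Segment:
    viseme: str
    start: float  # seconds
    end: float    # seconds


def to_segments(phonemes):
    """Map (phoneme, start, end) triples to viseme segments, merging adjacent repeats."""
    segments = []
    for phoneme, start, end in phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, "rest")
        if segments and segments[-1].viseme == viseme:
            segments[-1].end = end
        else:
            segments.append(Segment(viseme, start, end))
    return segments


def extent(segment, full_duration=0.12):
    """Assumed duration rule: a short viseme only partly reaches its target lip opening."""
    return min(1.0, (segment.end - segment.start) / full_duration)


def frame_weights(segments, fps=25):
    """Return one (from_viseme, to_viseme, blend, target_extent) tuple per video frame."""
    frames = []
    n_frames = int(round(segments[-1].end * fps))
    for i in range(n_frames):
        t = i / fps
        # Index of the last segment that has started by time t.
        idx = max((k for k, s in enumerate(segments) if s.start <= t), default=0)
        current = segments[idx]
        # Morph toward the next viseme; the final segment holds its own shape.
        target = segments[idx + 1] if idx + 1 < len(segments) else current
        duration = max(current.end - current.start, 1e-9)
        blend = (t - current.start) / duration  # linear blend, an assumption
        frames.append((current.viseme, target.viseme, blend, extent(target)))
    return frames


timeline = [("m", 0.0, 0.08), ("a", 0.08, 0.28), ("sil", 0.28, 0.40)]
weights = frame_weights(to_segments(timeline))
```

In a full system, each tuple would select the two viseme images to morph between, with `blend` setting how far along the flow field to warp and `target_extent` scaling how fully the target lip shape is reached.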
