Paper information - Towards video realistic synthetic visual speech

Towards video realistic synthetic visual speech

In this paper we present initial work towards a video-realistic visual speech synthesiser based on statistical models of shape and appearance. A synthesised image sequence corresponding to an utterance is formed by concatenating synthesis units (in this case phonemes) from a pre-recorded corpus of training data. A smoothing spline is applied to the concatenated parameters to ensure smooth transitions between frames, and the resultant parameters are then applied to the model. Early results look promising.
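The concatenate-then-smooth pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each phoneme unit is stored as a short list of per-frame model parameter vectors, and it swaps in a discrete second-difference (Whittaker-style) smoother for the paper's smoothing spline. All names (`concatenate_units`, `smooth_track`, `smooth_frames`, the `lam` smoothing weight) are hypothetical.

```python
from typing import Dict, List, Sequence

Frame = List[float]  # one vector of model parameters per video frame (assumed layout)


def concatenate_units(phonemes: Sequence[str],
                      corpus: Dict[str, List[Frame]]) -> List[Frame]:
    """Join the stored parameter frames of each phoneme unit end to end."""
    frames: List[Frame] = []
    for p in phonemes:
        frames.extend(corpus[p])
    return frames


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def smooth_track(y: Sequence[float], lam: float) -> List[float]:
    """Smooth one parameter trajectory.

    Minimises sum((z - y)^2) + lam * sum((second difference of z)^2),
    a discrete stand-in for a smoothing spline (an assumption of this sketch).
    """
    n = len(y)
    # Build the normal equations (I + lam * D^T D) z = y.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 2):
        d = {k: 1.0, k + 1: -2.0, k + 2: 1.0}  # one second-difference row
        for i, di in d.items():
            for j, dj in d.items():
                a[i][j] += lam * di * dj
    return solve_linear(a, list(y))


def smooth_frames(frames: List[Frame], lam: float = 5.0) -> List[Frame]:
    """Smooth every parameter dimension independently across time."""
    tracks = [smooth_track(list(t), lam) for t in zip(*frames)]
    return [list(f) for f in zip(*tracks)]
```

With a toy corpus such as `{"a": [[0.0, 1.0], [0.0, 1.0]], "b": [[1.0, 3.0], [1.0, 3.0]]}`, calling `smooth_frames(concatenate_units(["a", "b"], corpus))` returns four frames in which the abrupt jump at the unit boundary is spread over neighbouring frames. A larger `lam` gives a smoother trajectory that follows the original units less closely.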
