Face analysis for the synthesis of photo-realistic talking heads

This paper describes techniques for extracting bitmaps of facial parts from videos of a talking person. The goal is to synthesize high-quality, photo-realistic talking heads that show picture-perfect appearance and realistic head movements with good lip-sound synchronization. To synthesize a talking head, bitmaps of facial parts are combined to form whole heads, and sequences of such images are then integrated with audio from a text-to-speech synthesizer. For facial parts to be integrated seamlessly into an animation, their shape and visual appearance must be known with high accuracy. The recognition system must therefore not only locate the facial features but also determine the head's orientation and recognize facial expressions. Our face recognition proceeds in multiple steps of increasing precision. Using motion, color and shape information, the head's position and the locations of the main facial features are determined first. Smaller areas are then searched with matched filters in order to locate specific facial features with high precision. From this information the head's 3D orientation is calculated. Facial parts are cut from the image and, using the head's orientation, are warped into bitmaps with 'normalized' orientation and scale.
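
The matched-filter refinement step can be illustrated with a small sketch. It is not the authors' implementation. It assumes normalized cross-correlation as the matching score and a rectangular search region supplied by the coarser stage. The names `ncc`, `match_in_region` and `region` are hypothetical and chosen only for this example.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2D lists of gray values."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mp = sum(p) / len(p)
    mt = sum(t) / len(t)
    num = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mt) ** 2 for b in t))
    # A flat patch or template has no structure to match; score it as 0.
    return num / den if den > 0 else 0.0

def match_in_region(image, template, region):
    """Return the best top-left position of `template` inside `region`.

    `region` is (row0, col0, row1, col1): the range of candidate top-left
    corners, start inclusive and end exclusive. It stands in for the small
    area handed over by the coarser motion/color/shape stage (an assumption).
    """
    th, tw = len(template), len(template[0])
    r0, c0, r1, c1 = region
    # Clamp the region so the template always lies fully inside the image.
    r0, c0 = max(r0, 0), max(c0, 0)
    r1 = min(r1, len(image) - th + 1)
    c1 = min(c1, len(image[0]) - tw + 1)
    best_pos, best_score = None, -2.0
    for r in range(r0, r1):
        for c in range(c0, c1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score

# Toy data: a 6x6 blank image with the template pasted at row 2, column 3.
template = [[1, 9], [9, 1]]
image = [[0] * 6 for _ in range(6)]
for dr in range(2):
    for dc in range(2):
        image[2 + dr][3 + dc] = template[dr][dc]

pos, score = match_in_region(image, template, (0, 0, 6, 6))
```

Restricting the search to a small region keeps the exhaustive correlation cheap, which matches the coarse-to-fine ordering described above: the inexpensive cues narrow the area first, and the filter is applied only where a feature is expected.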
