Paper information - Converting text input into moving-face picture

Converting text input into moving-face picture

A moving picture of a face is produced whose mouth shape varies in step with an input text sentence. The input sentence is divided into a train of phonemes, and a speech synthesis technique is used that can output a voice feature and a duration for each phoneme. Based on the voice feature, a mouth-shape feature corresponding to each phoneme is determined. Based on the mouth-shape feature, the value of a mouth-shape parameter representing the mouth shape is determined. The value of the mouth-shape parameter for each frame of the moving picture is then controlled in accordance with the duration of each phoneme, thereby synthesizing a moving face picture whose mouth-shape variations agree with the speech output.
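
The pipeline described above (phoneme train with durations, mouth-shape feature per phoneme, mouth-shape parameter values, per-frame control) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the source: the two lookup tables, the parameter pair (opening, width), the phoneme symbols, and the choice to hold each shape constant for the phoneme's duration instead of deriving it from a voice feature or smoothing between shapes.

```python
from dataclasses import dataclass

# Hypothetical stage 1: phoneme symbol -> mouth-shape feature (a coarse class).
PHONEME_TO_SHAPE = {
    "a": "open", "i": "spread", "u": "rounded",
    "m": "closed", "sil": "closed",
}

# Hypothetical stage 2: mouth-shape feature -> parameter values
# (opening, width), each normalised to the range 0..1.
SHAPE_TO_PARAMS = {
    "open": (1.0, 0.6),
    "spread": (0.3, 1.0),
    "rounded": (0.3, 0.2),
    "closed": (0.0, 0.5),
}


@dataclass
class Phoneme:
    symbol: str
    duration_ms: float  # duration as reported by the speech synthesiser


def mouth_params_per_frame(phonemes, fps=30.0):
    """Return one (opening, width) pair per video frame.

    Each frame takes the parameters of the phoneme that is active at the
    frame's centre time, so a phoneme's shape lasts as long as its duration.
    """
    if not phonemes:
        return []
    frame_ms = 1000.0 / fps
    total_ms = sum(p.duration_ms for p in phonemes)
    n_frames = int(round(total_ms / frame_ms))

    frames = []
    for k in range(n_frames):
        t = (k + 0.5) * frame_ms  # centre time of frame k
        elapsed = 0.0
        current = phonemes[-1]  # fallback for rounding at the very end
        for p in phonemes:
            if t < elapsed + p.duration_ms:
                current = p
                break
            elapsed += p.duration_ms
        # Unknown phonemes fall back to a closed mouth in this sketch.
        shape = PHONEME_TO_SHAPE.get(current.symbol, "closed")
        frames.append(SHAPE_TO_PARAMS[shape])
    return frames


if __name__ == "__main__":
    train = [Phoneme("m", 100), Phoneme("a", 200)]
    for params in mouth_params_per_frame(train, fps=10.0):
        print(params)
```

Keeping the two lookups separate mirrors the two determination steps in the abstract. A real system would likely interpolate between successive shapes so the mouth does not jump at phoneme boundaries; that refinement is left out here.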

Masahide Kaneko | Yoshinori Hatori | Atsushi Koike | Seiichi Yamamoto | Norio Higuchi