In this demo, we present WeCard, a practical system that generates personalized multimodal electronic greeting cards based on parametric emotional talking-avatar synthesis. Given user-supplied greeting text and a facial image, WeCard automatically generates personalized speech together with expressive, lip-synchronized facial animation. Beyond parametric talking-avatar synthesis, WeCard incorporates two key technologies: 1) an automatic face-mesh generation algorithm based on MPEG-4 FAPs (Facial Animation Parameters) extracted by a face-alignment algorithm; and 2) an emotional audio-visual speech-synchronization algorithm based on a dynamic Bayesian network (DBN). WeCard then merges the user's preferred electronic-card scene with the emotional talking-avatar animation, rendering the final content as a Flash or video file that can easily be shared with friends. In this way, WeCard helps make multimodal greetings more attractive, beautiful, and sincere.
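The overall flow described above (speech synthesis, FAP-based face-mesh generation, and DBN-driven audio-visual synchronization, followed by scene merging) can be sketched as a simple pipeline. Everything below is a hypothetical illustration: the function and class names are placeholders invented for this sketch, not WeCard's actual API, and each stage is stubbed out rather than implementing the real algorithms.

```python
from dataclasses import dataclass, field

@dataclass
class GreetingCard:
    text: str                 # user-input greeting text
    face_image: str           # path to the user's facial photo
    scene: str = "default"    # user-preferred card scene
    frames: list = field(default_factory=list)

def synthesize_speech(text):
    # Placeholder for parametric emotional speech synthesis;
    # here we just tokenize the text as stand-in "phoneme" units.
    return {"audio": f"tts({text})", "units": text.split()}

def build_face_mesh(face_image):
    # Placeholder for face alignment + MPEG-4 FAP extraction
    # and automatic face-mesh generation.
    return {"mesh": f"mesh({face_image})"}

def synchronize(speech, mesh):
    # Placeholder for DBN-based audio-visual synchronization:
    # one animation frame per speech unit, driven by the mesh.
    return [f"frame({u}, {mesh['mesh']})" for u in speech["units"]]

def render_card(card):
    # Run the pipeline and attach the animation frames to the card;
    # a real system would then composite the scene and export video.
    speech = synthesize_speech(card.text)
    mesh = build_face_mesh(card.face_image)
    card.frames = synchronize(speech, mesh)
    return card

card = render_card(GreetingCard("Happy New Year", "me.jpg"))
```

This sketch only fixes the stage boundaries and data flow; the substance of each stage (TTS, alignment, DBN inference, rendering) is deliberately left abstract.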