A generic framework for editing and synthesizing multimodal data with relative emotion strength

Emotion is a core element of performance. In computer animation, body motion and facial expression are the two most common media through which a character conveys emotion. However, there has been little research into synthesizing these two types of character movement with different levels of emotion strength under intuitive control, a property that is difficult to model effectively. In this work, we explore a common model for representing emotion that applies to both body motion and facial expression synthesis. Unlike previous work that encodes emotion as discrete motion style descriptors, we propose a continuous control indicator called emotion strength, and present a data‐driven approach that synthesizes motion with fine control over emotion by varying this indicator. Rather than interpolating motion features to synthesize new motion as in existing work, our method explicitly learns a model that maps low‐level motion features to emotion strength. Because this model is learned in an offline training stage, the computation required to synthesize motion at run time is very low. We further demonstrate the generality of the framework by editing 2D face images using relative emotion strength. Our method is therefore applicable to interactive applications such as computer games, image‐editing tools, and virtual reality, as well as offline applications such as animation and movie production.
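To make the train-once, query-cheaply idea concrete, the following is a minimal sketch of the abstract's core step: learning a mapping from low-level motion features to a continuous emotion-strength score offline, so that scoring at run time is a single inexpensive prediction. The feature dimensionality, the labels, and the choice of a linear ridge regressor are illustrative assumptions, not the paper's actual model.

```python
# Minimal sketch (illustrative, not the paper's exact model): fit a
# mapping from low-level motion features to a continuous emotion
# strength in an offline training stage, then evaluate it at run time.
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical training data: one row of low-level motion features per
# clip (e.g., joint velocities, posture statistics) with an annotated
# relative emotion strength in [0, 1].
rng = np.random.default_rng(0)
X_train = rng.random((200, 24))   # 24-D feature vectors (assumed)
y_train = rng.random(200)         # relative emotion-strength labels

# Offline training stage: learn the feature -> strength mapping once.
model = Ridge(alpha=1.0).fit(X_train, y_train)

# Run time: predicting the strength of a new clip is a single linear
# evaluation, so the per-query cost is negligible.
x_new = rng.random((1, 24))
strength = model.predict(x_new)[0]
print(f"predicted emotion strength: {strength:.2f}")
```

In practice the regressor could be replaced by any model that exposes a continuous output (e.g., a ranking formulation as in relative-attribute methods); the point of the sketch is only that the expensive fitting happens before run time.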
