Learning to Generate 3D Stylized Character Expressions from Humans

We present ExprGen, a system that automatically generates 3D stylized character expressions from humans in a perceptually valid and geometrically consistent manner. Our multi-stage deep learning system utilizes the latent variables of human and character expression recognition convolutional neural networks to control a 3D animated character rig. This end-to-end system takes images of human faces and generates the character rig parameters that best match the human's facial expression. ExprGen generalizes to multiple characters and allows expression transfer between characters in a semi-supervised manner. Qualitative and quantitative evaluation of our method based on Mechanical Turk tests shows the high perceptual accuracy of our expression transfer results.
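To make the described pipeline concrete, the sketch below outlines, in PyTorch, the general shape of such a system: an expression-recognition CNN whose latent activations encode the human expression, followed by a regressor that maps that latent code to character rig parameters. This is a minimal illustration under assumed choices (layer sizes, a 128-dimensional latent code, a 20-dimensional rig vector, and the module names HumanExprCNN and RigParamRegressor are all hypothetical), not the authors' implementation.

```python
# Illustrative sketch only: not the ExprGen architecture, just the general idea of
# driving rig parameters from the latent code of an expression-recognition CNN.
import torch
import torch.nn as nn

class HumanExprCNN(nn.Module):
    """Toy expression-recognition CNN; its penultimate activations serve as the
    latent expression code that drives the character (dimensions are assumptions)."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.latent = nn.Linear(64 * 4 * 4, latent_dim)

    def forward(self, face_img):
        x = self.features(face_img).flatten(1)
        return self.latent(x)  # latent expression representation

class RigParamRegressor(nn.Module):
    """Maps a latent expression code to a vector of 3D character rig parameters."""
    def __init__(self, latent_dim=128, n_rig_params=20):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, n_rig_params), nn.Tanh(),  # rig controls bounded to [-1, 1]
        )

    def forward(self, z):
        return self.mlp(z)

# Usage: one 64x64 grayscale face image -> rig parameter vector.
encoder, regressor = HumanExprCNN(), RigParamRegressor()
rig_params = regressor(encoder(torch.randn(1, 1, 64, 64)))
print(rig_params.shape)  # torch.Size([1, 20])
```

In practice the two stages would be trained so that the regressed rig parameters reproduce the human expression on the character in a perceptually valid way; the toy example above only shows the data flow from face image to rig vector.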
