Emotional facial expression transfer from a single image via generative adversarial nets

Facial expression transfer from a single image is a challenging task and has drawn sustained attention in the fields of computer vision and computer graphics. Recently, generative adversarial nets (GANs) have provided a new approach to transferring a single face image toward target facial expressions. However, it is still difficult to obtain a sequence of smoothly changing facial expressions. We present a novel GAN‐based method for generating emotional facial expression animations given a single image and several sets of facial landmarks for the in‐between stages. In particular, landmarks of other subjects are incorporated into a GAN model to control the generated facial expression from a latent space. With the trained model, high‐quality face images and a smoothly changing facial expression sequence can be obtained effectively, as shown qualitatively and quantitatively in our experiments on the Multi‐PIE and CK+ data sets.
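To make the idea of landmark sets for in-between stages concrete, the following is a minimal sketch that produces intermediate landmark sets by linearly blending a neutral set toward a target set. It is an illustration only, not the authors' method: the abstract states that landmarks of other subjects are supplied to the model, and it does not say how the in-between landmarks are obtained. The function name `interpolate_landmarks`, the two-point toy landmark sets, and the use of linear interpolation are all assumptions introduced here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def interpolate_landmarks(
    start: List[Point], end: List[Point], steps: int
) -> List[List[Point]]:
    """Return `steps` landmark sets blending linearly from `start` to `end`.

    Hypothetical helper: the first returned set equals `start`,
    the last equals `end`, and the others are evenly spaced between them.
    """
    if len(start) != len(end):
        raise ValueError("landmark sets must have the same number of points")
    if steps < 2:
        raise ValueError("steps must be at least 2")

    frames = []
    for i in range(steps):
        t = i / (steps - 1)  # blend weight, from 0.0 to 1.0
        frames.append(
            [
                ((1 - t) * xs + t * xe, (1 - t) * ys + t * ye)
                for (xs, ys), (xe, ye) in zip(start, end)
            ]
        )
    return frames


# Toy example with two landmarks (assumed values, e.g. two mouth corners).
neutral = [(0.0, 0.0), (10.0, 0.0)]
smile = [(0.0, 2.0), (10.0, 4.0)]
frames = interpolate_landmarks(neutral, smile, steps=3)
```

In a pipeline of the kind the abstract describes, each such landmark set would serve as the conditioning input for one generated frame. Real landmark trajectories taken from other subjects need not be linear, so this blend is only a stand-in for them.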
