FaceTuneGAN: Face Autoencoder for Convolutional Expression Transfer Using Neural Generative Adversarial Networks

In this paper, we present FaceTuneGAN, a new 3D face model representation that decomposes facial identity and facial expression and encodes them separately. We propose a first adaptation of image-to-image translation networks, which have been used successfully in the 2D domain, to 3D face geometry. Leveraging recently released large face scan databases, we train a neural network to decouple factors of variation with a better knowledge of the face, enabling facial expression transfer and neutralization of expressive faces. Specifically, we design an adversarial architecture that adapts the base architecture of FUNIT and uses SpiralNet++ for its convolutional and sampling operations. On two publicly available datasets (FaceScape and CoMA), FaceTuneGAN achieves better identity decomposition and face neutralization than state-of-the-art techniques. It also outperforms the classical deformation transfer approach, predicting blendshapes that are closer to ground-truth data and show fewer undesired artifacts caused by large differences in facial morphology between source and target.
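
To give a rough idea of the kind of mesh convolution the abstract refers to, the following is a minimal NumPy sketch of a spiral-style convolution: each vertex gathers the features of a fixed-length sequence of neighbouring vertices, concatenates them, and applies a shared linear map. It is an illustration under stated assumptions, not the authors' implementation: the function name `spiral_conv`, the toy ring mesh, and the hand-written neighbour ordering are all hypothetical, and a real pipeline would precompute spiral sequences from the mesh topology.

```python
import numpy as np


def spiral_conv(x, spiral_indices, weight, bias):
    """Spiral-style mesh convolution (illustrative sketch).

    x:              (num_vertices, in_channels) per-vertex features
    spiral_indices: (num_vertices, seq_len) integer neighbour sequence per vertex
    weight:         (seq_len * in_channels, out_channels) shared linear weights
    bias:           (out_channels,) shared bias
    """
    num_vertices = spiral_indices.shape[0]
    # Gather neighbour features: shape (num_vertices, seq_len, in_channels).
    gathered = x[spiral_indices]
    # Concatenate the features of each sequence into one vector per vertex.
    flat = gathered.reshape(num_vertices, -1)
    # Apply the same linear map at every vertex.
    return flat @ weight + bias


# Toy example (assumed data): 4 vertices on a ring, one scalar feature each.
# Each sequence is [vertex itself, next vertex, previous vertex].
features = np.array([[0.0], [1.0], [2.0], [3.0]])
ring = np.array([[v, (v + 1) % 4, (v - 1) % 4] for v in range(4)])

# With uniform weights the operator reduces to averaging over the sequence.
avg_weight = np.full((3, 1), 1.0 / 3.0)
avg_bias = np.zeros(1)
smoothed = spiral_conv(features, ring, avg_weight, avg_bias)
```

Because the neighbour ordering is fixed per vertex, the gathered features can be treated like a flattened image patch, which is what lets an image-translation architecture be carried over to meshes that share one topology.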
