PIE: Portrait Image Embedding for Semantic Control

Editing portrait images is a popular and important research topic with a wide variety of applications. For ease of use, control should be provided via a semantically meaningful parameterization akin to computer animation controls. The vast majority of existing techniques do not provide such intuitive and fine-grained control, or only enable coarse editing of a single isolated control parameter. Very recently, high-quality semantically controlled editing has been demonstrated, though only on synthetically generated StyleGAN images. We present the first approach for embedding real portrait images in the latent space of StyleGAN that allows for intuitive editing of the head pose, facial expression, and scene illumination in the image. Semantic editing in parameter space is achieved via StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN. We design a novel hierarchical non-linear optimization problem to obtain the embedding. An identity preservation energy term enables spatially coherent edits while maintaining facial integrity. Our approach runs at interactive frame rates, allowing the user to explore the space of possible edits. We evaluate our approach on a wide set of portrait photos, compare it to the current state of the art, and validate the effectiveness of its components in an ablation study.
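At a high level, the embedding described above can be viewed as energy minimization over a latent code: a reconstruction term pulls the generated image toward the input photo, while an identity-preservation term keeps face-recognition features stable. The sketch below is a toy illustration of this idea, not the paper's actual implementation: the StyleGAN generator and the identity network are replaced by small random linear maps, and all names (`G`, `F`, `energy`, `lam`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (hypothetical): the real method uses a pretrained StyleGAN
# generator and deep face-recognition features instead of linear maps.
G = rng.normal(size=(64, 16)) / 10.0   # "generator": latent (16-d) -> image (64-d)
F = rng.normal(size=(8, 64)) / 10.0    # "identity network": image -> identity features

target = rng.normal(size=64)           # the real portrait photo to embed
id_feat = F @ target                   # identity features of the input photo

def energy(w, lam=0.5):
    """Reconstruction energy plus identity-preservation energy."""
    img = G @ w
    rec = np.sum((img - target) ** 2)            # match the input photo
    ident = np.sum((F @ img - id_feat) ** 2)     # preserve identity features
    return rec + lam * ident

def grad(w, lam=0.5):
    """Analytic gradient of the quadratic energy with respect to w."""
    img = G @ w
    return 2 * G.T @ (img - target) + lam * 2 * G.T @ F.T @ (F @ img - id_feat)

# Gradient descent on the latent code (the paper uses a hierarchical
# non-linear optimization; plain gradient descent suffices for this toy).
w = np.zeros(16)
for _ in range(300):
    w -= 0.1 * grad(w)

print(energy(w) < energy(np.zeros(16)))  # the optimized code lowers the energy
```

In the full method the energy is non-linear and optimized hierarchically over StyleGAN's layered latent space, but the structure of the objective (data term plus identity term weighted by a trade-off parameter) is the same as in this sketch.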
