GuidedStyle: Attribute Knowledge Guided Style Manipulation for Semantic Face Editing

Although significant progress has been made in synthesizing high-quality and visually realistic face images by unconditional Generative Adversarial Networks (GANs), there still lacks of control over the generation process in order to achieve semantic face editing. In addition, it remains very challenging to maintain other face information untouched while editing the target attributes. In this paper, we propose a novel learning framework, called GuidedStyle, to achieve semantic face editing on StyleGAN by guiding the image generation process with a knowledge network. Furthermore, we allow an attention mechanism in StyleGAN generator to adaptively select a single layer for style manipulation. As a result, our method is able to perform disentangled and controllable edits along various attributes, including smiling, eyeglasses, gender, mustache and hair color. Both qualitative and quantitative results demonstrate the superiority of our method over other competing methods for semantic face editing. Moreover, we show that our model can be also applied to different types of real and artistic face editing, demonstrating strong generalization ability.

[1]  Bolei Zhou,et al.  Closed-Form Factorization of Latent Semantics in GANs , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Julien Rabin,et al.  Wasserstein Barycenter and Its Application to Texture Mixing , 2011, SSVM.

[3]  Linlin Shen,et al.  SwitchGAN for Multi-domain Facial Image Translation , 2019, 2019 IEEE International Conference on Multimedia and Expo (ICME).

[4]  Anil A. Bharath,et al.  Inverting the Generator of a Generative Adversarial Network , 2016, IEEE Transactions on Neural Networks and Learning Systems.

[5]  Alexei A. Efros,et al.  Generative Visual Manipulation on the Natural Image Manifold , 2016, ECCV.

[6]  LinLin Shen,et al.  Deep Feature Consistent Variational Autoencoder , 2016, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).

[7]  Guillaume Lample,et al.  Fader Networks: Manipulating Images by Sliding Attributes , 2017, NIPS.

[8]  Bolei Zhou,et al.  Semantic photo manipulation with a generative image prior , 2019, ACM Trans. Graph..

[9]  Thomas Brox,et al.  Synthesizing the preferred inputs for neurons in neural networks via deep generator networks , 2016, NIPS.

[10]  Jaakko Lehtinen,et al.  Analyzing and Improving the Image Quality of StyleGAN , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Daniel Cohen-Or,et al.  Designing an encoder for StyleGAN image manipulation , 2021, ACM Trans. Graph..

[12]  Peter Wonka,et al.  StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows , 2020, ArXiv.

[13]  Jing Liao,et al.  CariGANs , 2018, ACM Trans. Graph..

[14]  Ole Winther,et al.  Autoencoding beyond pixels using a learned similarity metric , 2015, ICML.

[15]  Jung-Woo Ha,et al.  StarGAN v2: Diverse Image Synthesis for Multiple Domains , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[17]  Wei Shen,et al.  Learning Residual Images for Face Attribute Manipulation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[19]  Jianmin Bao,et al.  High-Fidelity and Arbitrary Face Editing , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[21]  David Bau,et al.  Rewriting a Deep Generative Model , 2020, ECCV.

[22]  Skyler T. Hawk,et al.  Presentation and validation of the Radboud Faces Database , 2010 .

[23]  Oliver Wang,et al.  MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Han Zhang,et al.  Self-Attention Generative Adversarial Networks , 2018, ICML.

[25]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Chen Gao,et al.  PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Deli Zhao,et al.  In-Domain GAN Inversion for Real Image Editing , 2020, ECCV.

[28]  Léon Bottou,et al.  Wasserstein Generative Adversarial Networks , 2017, ICML.

[29]  Tero Karras,et al.  Training Generative Adversarial Networks with Limited Data , 2020, NeurIPS.

[30]  Bolei Zhou,et al.  Interpreting the Latent Space of GANs for Semantic Face Editing , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Bolei Zhou,et al.  Seeing What a GAN Cannot Generate , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[32]  Adam Finkelstein,et al.  PairedCycleGAN: Asymmetric Style Transfer for Applying and Removing Makeup , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33]  Rob Fergus,et al.  Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks , 2015, NIPS.

[34]  LinLin Shen,et al.  Improving variational autoencoder with deep feature consistent and generative adversarial training , 2019, Neurocomputing.

[35]  Dani Lischinski,et al.  StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[37]  Mohamed A. Elgharib,et al.  PIE , 2020 .

[38]  Jaakko Lehtinen,et al.  GANSpace: Discovering Interpretable GAN Controls , 2020, NeurIPS.

[39]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Christian Theobalt,et al.  StyleRig: Rigging StyleGAN for 3D Control Over Portrait Images , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Daniel Cohen-Or,et al.  Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Anil K. Jain,et al.  WarpGAN: Automatic Caricature Generation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Liu Ying,et al.  ClsGAN: Selective Attribute Editing Model based on Classification Adversarial Network , 2021, Neural Networks.

[44]  Raymond Y. K. Lau,et al.  Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[45]  Ping Chen,et al.  His-GAN: A histogram-based GAN model to improve data generation quality , 2019, Neural Networks.

[46]  Wei Xiong,et al.  CariGAN: Caricature Generation through Weakly Paired Adversarial Learning , 2018, Neural Networks.

[47]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[48]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  Peter Wonka,et al.  Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[50]  Aude Oliva,et al.  GANalyze: Toward Visual Definitions of Cognitive Image Properties , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[51]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[52]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[53]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[54]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[55]  Peter Wonka,et al.  Image2StyleGAN++: How to Edit the Embedded Images? , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[56]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[57]  Bolei Zhou,et al.  InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs , 2020, IEEE transactions on pattern analysis and machine intelligence.

[58]  Shiguang Shan,et al.  AttGAN: Facial Attribute Editing by Only Changing What You Want , 2017, IEEE Transactions on Image Processing.

[59]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[60]  Harshad Rai,et al.  Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks , 2018 .

[61]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[62]  Serge J. Belongie,et al.  Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[63]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).