Controllable Multi-Attribute Editing of High-Resolution Face Images

In recent years, significant progress has been achieved in face image editing due to the success of Generative Adversarial Networks (GANs). However, state-of-the-art face editing methods suffer from two main limitations: 1) they are only applicable to face images of relatively low resolution and 2) multi-attribute face editing may cause uncontrolled changes in non-target face attributes. To solve these problems, we propose a novel High-Quality Generative Adversarial Network (HQ-GAN) for controllable editing of multiple face attributes in high-resolution images. HQ-GAN introduces two ideas that address the limitations of resolution and controllability, respectively: 1) fine-grained textures and realistic details of high-resolution face images are better preserved with the aid of textural features extracted by a wavelet transform module and 2) the desired target attributes of an edit are emphasized using a weighted binary cross-entropy (BCE) loss, so that the influence on non-target attributes is greatly reduced. To the best of our knowledge, HQ-GAN is the first attempt to achieve continuous editing of multiple face attributes on high-resolution images from the CelebA-HQ dataset using only 28 000 training samples. Extensive qualitative results demonstrate the superiority of the proposed method in rendering realistic high-resolution face images with accurate attribute modification, and comprehensive quantitative results show that the proposed method significantly outperforms state-of-the-art face editing methods.
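The weighted BCE idea can be illustrated with a minimal sketch. The abstract does not give the weighting scheme, so everything below is an assumption made for illustration: the function names `weighted_bce` and `target_weights`, the rule that attributes changed by the edit receive a larger weight, and the value `target_weight=5.0` are all hypothetical and are not taken from HQ-GAN.

```python
import math


def weighted_bce(probs, labels, weights, eps=1e-7):
    """Mean of per-attribute binary cross-entropy terms, each scaled by its weight."""
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def target_weights(source_labels, desired_labels, target_weight=5.0):
    """Hypothetical rule: attributes the edit changes get target_weight, all others 1.0."""
    return [
        target_weight if s != d else 1.0
        for s, d in zip(source_labels, desired_labels)
    ]


# Four binary attributes; the edit flips only the first one.
source = [0, 1, 0, 1]
desired = [1, 1, 0, 1]
weights = target_weights(source, desired)

# Assumed classifier probabilities for the edited image (illustrative values).
probs = [0.6, 0.9, 0.2, 0.7]

loss_weighted = weighted_bce(probs, desired, weights)
loss_plain = weighted_bce(probs, desired, [1.0] * len(desired))
print(weights, loss_weighted > loss_plain)
```

Under this assumed scheme, the error on the edited attribute dominates the loss, while the unchanged attributes still contribute a term that penalizes drift away from their original values.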
