VAE/WGAN-Based Image Representation Learning for Pose-Preserving Seamless Identity Replacement in Facial Images

We present a novel variational generative adversarial network (VGAN), trained with a Wasserstein loss, that learns a latent representation of a face image that is invariant to identity but preserves head-pose information. This representation facilitates synthesis of a realistic face image with the same head pose as a given input image, but with a different identity. One application of this network is in privacy-sensitive scenarios: after the identity in an image is replaced, utility, such as head pose, can still be recovered. Extensive experimental validation on synthetic and real human-face image datasets, performed under three threat scenarios, confirms the ability of the proposed network to preserve the head pose of the input image, mask the input identity, and synthesize a good-quality, realistic face image of a desired identity. We also show that our network can be used to perform pose-preserving identity morphing and identity-preserving pose morphing. The proposed method improves over a recent state-of-the-art method in terms of both quantitative metrics and synthesized image quality.
