BCGAN: Facial Expression Synthesis by Bottleneck-Layered Conditional Generative Adversarial Networks

Facial expression synthesis is widely applied to emotion prediction and face recognition for human-computer interaction. The task is challenging because realistic and accurate facial expressions are difficult to reconstruct. Early deep learning methods focus only on pixel-level manipulation and are therefore poorly suited to generating realistic facial expressions. In this paper, we propose a bottleneck-layered conditional generative adversarial network (BCGAN) for more realistic and accurate facial expression synthesis. The BCGAN generator adopts bottleneck layers that use channel-wise concatenation so that training relies only on meaningful features. In addition, a dense connection linking all bottleneck layers is added so that the generated image preserves the facial details of the original image. Both quantitative and qualitative evaluations were performed on the Radboud Faces Database (RaFD). Experimental results showed that BCGAN achieved 2% higher classification accuracy (98.7%) on the generated images, as well as faster training, compared with the state-of-the-art approach.
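The abstract does not give the generator's layer configuration, so the following is only a minimal sketch of the general idea of densely connected bottleneck layers with channel-wise concatenation, not the authors' implementation. It assumes each bottleneck is a 1x1 convolution followed by ReLU; the names `conv1x1`, `relu`, `dense_bottleneck_block`, `num_layers`, and `growth` are hypothetical, and the random weights stand in for learned parameters.

```python
import random


def conv1x1(x, weights):
    """1x1 convolution: a per-pixel linear map across channels.

    x: list of C_in channels, each an H x W list of lists.
    weights: C_out rows, each holding C_in coefficients.
    """
    height, width = len(x[0]), len(x[0][0])
    return [
        [[sum(w * ch[r][c] for w, ch in zip(row, x)) for c in range(width)]
         for r in range(height)]
        for row in weights
    ]


def relu(x):
    """Element-wise ReLU over every channel."""
    return [[[max(0.0, v) for v in line] for line in ch] for ch in x]


def dense_bottleneck_block(x, num_layers, growth, seed=0):
    """Stack bottleneck layers with dense connections (illustrative only).

    Each layer reads the concatenation of the input and all earlier layer
    outputs, compresses it to `growth` channels with a 1x1 bottleneck, and
    appends the result along the channel axis.
    """
    rng = random.Random(seed)
    features = list(x)  # channels are list items, so concatenation is list +
    for _ in range(num_layers):
        c_in = len(features)
        # Assumption: random weights in place of trained ones.
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(c_in)]
                   for _ in range(growth)]
        new_maps = relu(conv1x1(features, weights))
        features = features + new_maps  # channel-wise concatenation
    return features
```

In this sketch the input channels pass through the block unchanged as the leading channels of the output, which is one way a dense connection can carry details of the original image forward; the output has `len(x) + num_layers * growth` channels.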
