Pixel-based Facial Expression Synthesis

Facial expression synthesis has achieved remarkable advances with the advent of Generative Adversarial Networks (GANs). However, GAN-based approaches generate photo-realistic results only as long as the testing data distribution is close to the training data distribution; their quality degrades significantly when testing images come from even a slightly different distribution. Moreover, recent work has shown that facial expressions can be synthesized by modifying only localized face regions. Motivated by these observations, we propose a pixel-based facial expression synthesis method in which each output pixel observes only one input pixel. The proposed method achieves good generalization by leveraging only a few hundred training images. Experimental results demonstrate that it performs comparably to state-of-the-art GANs on in-dataset images and significantly better on out-of-dataset images. In addition, the proposed model is two orders of magnitude smaller, which makes it suitable for deployment on resource-constrained devices.
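The core idea above, each output pixel observing only one input pixel, can be illustrated with a minimal sketch. The snippet below assumes a per-pixel affine map fitted by least squares over aligned (input, target-expression) image pairs; the function names and the closed-form fit are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def fit_per_pixel_affine(X, Y):
    """Fit y[i] = a[i] * x[i] + b[i] independently per pixel.

    X, Y: arrays of shape (n_images, n_pixels) holding aligned
    input and target-expression images (hypothetical setup).
    """
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    cov = ((X - x_mean) * (Y - y_mean)).mean(axis=0)
    var = X.var(axis=0) + 1e-8          # small ridge term for stability
    a = cov / var                        # per-pixel slope
    b = y_mean - a * x_mean              # per-pixel intercept
    return a, b

def apply_per_pixel_affine(x, a, b):
    # Each output pixel depends on exactly one input pixel.
    return a * x + b

# Tiny synthetic demo: 200 "images" of 64 pixels each.
rng = np.random.default_rng(0)
X = rng.random((200, 64))
a_true = rng.uniform(0.5, 1.5, 64)
b_true = rng.uniform(-0.1, 0.1, 64)
Y = a_true * X + b_true

a, b = fit_per_pixel_affine(X, Y)
pred = apply_per_pixel_affine(X[0], a, b)
print(np.allclose(pred, Y[0], atol=1e-3))  # recovers the mapping closely
```

Because each pixel is fitted independently, the model has only two parameters per pixel, which is consistent with the abstract's claim of a model two orders of magnitude smaller than typical GAN generators.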
