A Survey of State-of-the-Art GAN-based Approaches to Image Synthesis

In the past few years, Generative Adversarial Networks (GANs) have received immense attention by researchers in a variety of application domains. This new field of deep learning has been growing rapidly and has provided a way to learn deep representations without extensive use of annotated training data. Their achievements may be used in a variety of applications, including speech synthesis, image and video generation, semantic image editing, and style transfer. Image synthesis is an important component of expert systems and it attracted much attention since the introduction of GANs. However, GANs are known to be difficult to train especially when they try to generate high resolution images. This paper gives a thorough overview of the state-of-the-art GANs-based approaches in four applicable areas of image generation including Text-to-Image-Synthesis, Image-to-Image-Translation, Face Aging, and 3D Image Synthesis. Experimental results show state-of-the-art performance using GANs compared to traditional approaches in the fields of image processing and machine vision.

[1]  Marcus Liwicki,et al.  TAC-GAN - Text Conditioned Auxiliary Classifier Generative Adversarial Network , 2017, ArXiv.

[2]  Jonathon Shlens,et al.  Conditional Image Synthesis with Auxiliary Classifier GANs , 2016, ICML.

[3]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[4]  Lin Yang,et al.  Photographic Text-to-Image Synthesis with a Hierarchically-Nested Adversarial Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5]  H. T. Kung,et al.  Adversarial nets with perceptual losses for text-to-image synthesis , 2017, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP).

[6]  Han Zhang,et al.  Self-Attention Generative Adversarial Networks , 2018, ICML.

[7]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Dimitris N. Metaxas,et al.  StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[9]  Seunghoon Hong,et al.  Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10]  J. Nocedal,et al.  A Limited Memory Algorithm for Bound Constrained Optimization , 1995, SIAM J. Sci. Comput..

[11]  Yike Guo,et al.  Semantic Image Synthesis via Adversarial Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[12]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[13]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Anil K. Jain,et al.  Learning Face Age Progression: A Pyramid Architecture of GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[15]  Leonidas J. Guibas,et al.  Learning Representations and Generative Models for 3D Point Clouds , 2017, ICML.

[16]  Brendan J. Frey,et al.  Graphical Models for Machine Learning and Digital Communication , 1998 .

[17]  Qi Li,et al.  Global and Local Consistent Age Generative Adversarial Networks , 2018, 2018 24th International Conference on Pattern Recognition (ICPR).

[18]  Xiaodong Liu,et al.  Language-Based Image Editing with Recurrent Attentive Models , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19]  Carl Doersch,et al.  Tutorial on Variational Autoencoders , 2016, ArXiv.

[20]  Yoshua Bengio,et al.  ChatPainter: Improving Text to Image Generation using Dialogue , 2018, ICLR.

[21]  Zhe Gan,et al.  AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[22]  Gang Hua,et al.  Labeled Faces in the Wild: A Survey , 2016 .

[23]  Guillaume Lample,et al.  Fader Networks: Manipulating Images by Sliding Attributes , 2017, NIPS.

[24]  Bernt Schiele,et al.  Learning What and Where to Draw , 2016, NIPS.

[25]  Xu Tang,et al.  Face Aging with Identity-Preserved Conditional Generative Adversarial Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[27]  Jan Kautz,et al.  High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[29]  Yao Sun,et al.  Face Aging with Contextual Generative Adversarial Nets , 2017, ACM Multimedia.

[30]  Yang Song,et al.  Age Progression/Regression by Conditional Adversarial Autoencoder , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Bernt Schiele,et al.  Generative Adversarial Text to Image Synthesis , 2016, ICML.

[32]  Aaron C. Courville,et al.  Adversarially Learned Inference , 2016, ICLR.

[33]  Chao Yang,et al.  Shape Inpainting Using 3D Generative Adversarial Network and Recurrent Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  Sebastian Scherer,et al.  VoxNet: A 3D Convolutional Neural Network for real-time object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[35]  Yitong Li,et al.  Video Generation From Text , 2017, AAAI.

[36]  David Meger,et al.  Improved Adversarial Systems for 3D Object Generation and Reconstruction , 2017, CoRL.

[37]  Zhichao Zhou,et al.  DeepPano: Deep Panoramic Representation for 3-D Shape Recognition , 2015, IEEE Signal Processing Letters.

[38]  Jiajun Wu,et al.  Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Vladlen Koltun,et al.  Photographic Image Synthesis with Cascaded Refinement Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[40]  Trevor Darrell,et al.  Adversarial Feature Learning , 2016, ICLR.

[41]  Mahadev Satyanarayanan,et al.  OpenFace: A general-purpose face recognition library with mobile applications , 2016 .

[42]  Jiajun Wu,et al.  MarrNet: 3D Shape Reconstruction via 2.5D Sketches , 2017, NIPS.

[43]  Jiajun Wu,et al.  Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling , 2016, NIPS.

[44]  Sanja Fidler,et al.  Be Your Own Prada: Fashion Synthesis with Structural Coherence , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[45]  Heng Tao Shen,et al.  Dual Conditional GANs for Face Aging and Rejuvenation , 2018, IJCAI.

[46]  Zheng-Hua Tan,et al.  Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification , 2017, INTERSPEECH.

[47]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[48]  Jean-Luc Dugelay,et al.  Boosting cross-age face verification via generative age normalization , 2017, 2017 IEEE International Joint Conference on Biometrics (IJCB).