Recent Advances of Generative Adversarial Networks in Computer Vision

The emergence of generative adversarial networks (GANs) provides a new approach and framework for computer vision. Compared with traditional machine learning algorithms, GANs are trained adversarially and are more powerful in both feature learning and representation. GANs also exhibit some problems, such as non-convergence, mode collapse, and uncontrollability due to their high degree of freedom. How to improve the theory of GANs and apply it to computer-vision tasks has attracted much research effort. In this paper, recently proposed GAN models and their applications in computer vision are systematically reviewed. In particular, we first survey the history and development of generative algorithms, the mechanism of GANs, their fundamental network structures, and the theoretical analysis of the original GAN. Classical GAN algorithms are then compared comprehensively in terms of mechanism, visual quality of generated samples, and Fréchet Inception Distance. These networks are further evaluated in terms of network construction, performance, and applicability through extensive experiments on public datasets. After that, several typical applications of GANs in computer vision, including high-quality sample generation, style transfer, and image translation, are examined. Finally, existing problems of GANs are summarized and discussed, and potential future research topics are outlined.
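For orientation, the two quantities named in the abstract can be written out in their commonly used forms. This is a sketch for reference only: the abstract itself gives no formulas, and the symbols are assumptions introduced here ($G$ and $D$ for the generator and discriminator, $p_{\mathrm{data}}$ and $p_z$ for the data and noise distributions, and $(\mu_r, \Sigma_r)$, $(\mu_g, \Sigma_g)$ for the mean and covariance of Inception features of real and generated samples).

```latex
% Minimax objective of the original GAN (standard form)
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% Frechet Inception Distance between real (r) and generated (g) feature statistics
\mathrm{FID}
  = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\bigl(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\bigr)
```

A lower FID indicates that the generated feature statistics are closer to those of the real data.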
