How Generative Adversarial Networks and Their Variants Work

Generative Adversarial Networks (GANs) have received wide attention in the machine learning field for their potential to learn high-dimensional, complex real data distributions. Specifically, they do not rely on any assumptions about the distribution and can generate realistic samples from a latent space in a simple manner. This powerful property allows GANs to be applied to various applications such as image synthesis, image attribute editing, image translation, and domain adaptation, as well as to other academic fields. In this article, we discuss the details of GANs for readers who are familiar with GANs but do not comprehend them deeply, or who wish to view GANs from various perspectives. In addition, we explain how GANs operate and the fundamental meaning of various objective functions that have been suggested recently. We then focus on how the GAN can be combined with an autoencoder framework. Finally, we enumerate the GAN variants that are applied to various tasks and other fields for those who are interested in exploiting GANs for their research.
