STDGAN: ResBlock Based Generative Adversarial Nets Using Spectral Normalization and Two Different Discriminators

The generative adversarial network (GAN) is a powerful generative model, but it suffers from two key problems: unstable convergence and mode collapse. To overcome these drawbacks, this paper presents a novel GAN architecture, called STDGAN, which consists of one generator and two different discriminators. Viewing GAN training as a minimax game, the proposed architecture works as follows. The generator G aims to produce realistic-looking samples that fool both discriminators. The first discriminator D1 assigns high scores to samples from the data distribution, while the second discriminator D2 conversely favors samples from the generator. Specifically, minibatch discrimination and Spectral Normalization (SN) are adopted in D1. Then, based on the ResBlock architecture, Spectral Normalization and Scaled Exponential Linear Units (SELU) are applied in the first and last half of the layers of D2, respectively. In addition, a novel loss function is designed to optimize STDGAN by minimizing the KL divergence. Extensive experiments on the CIFAR-10/100 and ImageNet datasets demonstrate that STDGAN effectively mitigates the convergence and mode-collapse problems and achieves a higher Inception Score (IS) and a lower Fréchet Inception Distance (FID) than other state-of-the-art GANs.
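The abstract does not spell out the loss function, so the following is only a toy sketch of how a two-discriminator objective of this kind could look, assuming a standard adversarial log-loss with the roles of the two discriminators reversed (the function names and scalar inputs are hypothetical stand-ins for discriminator output probabilities):

```python
import math

def d1_loss(d1_real, d1_fake):
    # D1 is trained like a standard GAN discriminator:
    # high scores for real samples, low scores for generated ones.
    return -(math.log(d1_real) + math.log(1.0 - d1_fake))

def d2_loss(d2_real, d2_fake):
    # D2 is trained with the roles reversed:
    # it favors generated samples over real ones.
    return -(math.log(1.0 - d2_real) + math.log(d2_fake))

def g_loss(d1_fake, d2_fake):
    # The generator tries to fool both discriminators at once:
    # push D1's score on fakes up and D2's score on fakes down.
    return -(math.log(d1_fake) + math.log(1.0 - d2_fake))
```

Under this formulation, a generator sample that both discriminators misjudge (D1 scores it near 1, D2 near 0) yields a low generator loss, which mirrors the paper's description of G fooling both D1 and D2; the actual STDGAN loss additionally involves a KL-divergence term not reproduced here.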
