Incremental focal loss GANs

Abstract Generative Adversarial Networks (GANs) have achieved inspiring performance in both unsupervised image generation and conditional cross-modal image translation. However, how to generate quality images at an affordable cost is still challenging. We argue that it is the vast number of easy examples that disturb training of GANs, and propose to address this problem by down-weighting losses assigned to easy examples. Our novel Incremental Focal Loss (IFL) progressively focuses training on hard examples and prevents easy examples from overwhelming the generator and discriminator during training. In addition, we propose an enhanced self-attention (ESA) mechanism to boost the representational capacity of the generator. We apply IFL and ESA to a number of unsupervised and conditional GANs, and conduct experiments on various tasks, including face photo-sketch synthesis, map↔aerial-photo translation, single image super-resolution reconstruction, and image generation on CelebA, LSUN, and CIFAR-10. Results show that IFL boosts learning of GANs over existing loss functions. Besides, both IFL and ESA make GANs produce quality images with realistic details in all these tasks, even when no task adaptation is involved.

[1]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Alexei A. Efros,et al.  Toward Multimodal Image-to-Image Translation , 2017, NIPS.

[3]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4]  Dacheng Tao,et al.  Attention-GAN for Object Transfiguration in Wild Images , 2018, ECCV.

[5]  Michael Elad,et al.  On Single Image Scale-Up Using Sparse-Representations , 2010, Curves and Surfaces.

[6]  Dustin Tran,et al.  Image Transformer , 2018, ICML.

[7]  Raymond Y. K. Lau,et al.  Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[8]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[9]  Luc Van Gool,et al.  The Pascal Visual Object Classes Challenge: A Retrospective , 2014, International Journal of Computer Vision.

[10]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[11]  Jianping Fan,et al.  iPrivacy: Image Privacy Protection by Identifying Sensitive Objects via Deep Multi-Task Learning , 2017, IEEE Transactions on Information Forensics and Security.

[12]  Jitendra Malik,et al.  A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[13]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[15]  Zunlei Feng,et al.  Neural Style Transfer: A Review , 2017, IEEE Transactions on Visualization and Computer Graphics.

[16]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[17]  Yunsong Li,et al.  Deep Latent Low-Rank Representation for Face Sketch Synthesis , 2019, IEEE Transactions on Neural Networks and Learning Systems.

[18]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[19]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[20]  Pieter Abbeel,et al.  PixelSNAIL: An Improved Autoregressive Generative Model , 2017, ICML.

[21]  Han Zhang,et al.  Self-Attention Generative Adversarial Networks , 2018, ICML.

[22]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  David Zhang,et al.  FSIM: A Feature Similarity Index for Image Quality Assessment , 2011, IEEE Transactions on Image Processing.

[24]  Joan Bruna,et al.  Super-Resolution with Deep Convolutional Sufficient Statistics , 2015, ICLR.

[25]  Radim Sára,et al.  Spatial Pattern Templates for Recognition of Objects with Regular Structure , 2013, GCPR.

[26]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Mario Lucic,et al.  Are GANs Created Equal? A Large-Scale Study , 2017, NeurIPS.

[28]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[29]  Fei Gao,et al.  Deep Multimodal Distance Metric Learning Using Click Constraints for Image Ranking , 2017, IEEE Transactions on Cybernetics.

[30]  Xinbo Gao,et al.  Random sampling for fast face sketch synthesis , 2017, Pattern Recognit..

[31]  Xiaohua Zhai,et al.  The GAN Landscape: Losses, Architectures, Regularization, and Normalization , 2018, ArXiv.

[32]  Léon Bottou,et al.  Wasserstein Generative Adversarial Networks , 2017, ICML.

[33]  Xuelong Li,et al.  A Comprehensive Survey to Face Hallucination , 2013, International Journal of Computer Vision.

[34]  Serge J. Belongie,et al.  Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[35]  Xiaogang Wang,et al.  StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Ilya Sutskever,et al.  Generating Long Sequences with Sparse Transformers , 2019, ArXiv.

[37]  Jun Yu,et al.  Hierarchical Deep Click Feature Prediction for Fine-Grained Image Recognition , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Dumitru Erhan,et al.  Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Jun Yu,et al.  Multitask Autoencoder Model for Recovering Human Poses , 2018, IEEE Transactions on Industrial Electronics.

[40]  Yue Gao,et al.  Robust Face Sketch Synthesis via Generative Adversarial Fusion of Priors and Parametric Sigmoid , 2018, IJCAI.

[41]  Xiaokang Yang,et al.  Gated-GAN: Adversarial Gated Networks for Multi-Collection Style Transfer , 2019, IEEE Transactions on Image Processing.

[42]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[43]  Xiaohua Zhai,et al.  High-Fidelity Image Generation With Fewer Labels , 2019, ICML.

[44]  Bin Song,et al.  Evaluation on synthesized face sketches , 2016, Neurocomputing.

[45]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[46]  Yinda Zhang,et al.  LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop , 2015, ArXiv.

[47]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[48]  Xinbo Gao,et al.  Neural Probabilistic Graphical Model for Face Sketch Synthesis , 2020, IEEE Transactions on Neural Networks and Learning Systems.

[49]  Christian Ledig,et al.  Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Zhe Gan,et al.  AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[51]  Yochai Blau,et al.  The Perception-Distortion Tradeoff , 2017, CVPR.

[52]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[53]  Xiaogang Wang,et al.  Face Photo-Sketch Synthesis and Recognition , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.