Focal Frequency Loss for Image Reconstruction and Synthesis

Image reconstruction and synthesis have witnessed remarkable progress thanks to the development of generative models. Nonetheless, gaps could still exist between the real and generated images, especially in the frequency domain. In this study, we show that narrowing gaps in the frequency domain can ameliorate image reconstruction and synthesis quality further. We propose a novel focal frequency loss, which allows a model to adaptively focus on frequency components that are hard to synthesize by down-weighting the easy ones. This objective function is complementary to existing spatial losses, offering great impedance against the loss of important frequency information due to the inherent bias of neural networks. We demonstrate the versatility and effectiveness of focal frequency loss to improve popular models, such as VAE, pix2pix, and SPADE, in both perceptual quality and quantitative performance. We further show its potential on StyleGAN2. 1, 2

[1]  Shijian Lu,et al.  RDA: Robust Domain Adaptation via Fourier Adversarial Attacking , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[2]  Liujuan Cao,et al.  DeeperForensics Challenge 2020 on Real-World Face Forgery Detection: Methods and Results , 2021, ArXiv.

[3]  Daniel Cohen-Or,et al.  SWAGAN , 2021, ACM Trans. Graph..

[4]  Margret Keuper,et al.  Spectral Distribution Aware Image Generation , 2020, AAAI.

[5]  Gao Huang,et al.  Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving , 2020, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[6]  Lei Ma,et al.  FakeRetouch: Evading DeepFakes Detection via the Guidance of Deliberate Noise , 2020, ArXiv.

[7]  David D. Cox,et al.  not-so-BigGAN: Generating High-Fidelity Images on a Small Compute Budget , 2020, ArXiv.

[8]  Chunxiao Liu,et al.  TSIT: A Simple and Versatile Framework for Image-to-Image Translation , 2020, ECCV.

[9]  Jan Kautz,et al.  NVAE: A Deep Hierarchical Variational Autoencoder , 2020, NeurIPS.

[10]  Polina Golland,et al.  Joint Frequency- and Image-Space Learning for Fourier Imaging , 2020, ArXiv.

[11]  Jonathan T. Barron,et al.  Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains , 2020, NeurIPS.

[12]  Tie-Yan Liu,et al.  Invertible Image Rescaling , 2020, ECCV.

[13]  Stanislav Pidhorskyi,et al.  Adversarial Latent Autoencoders , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Shuhang Gu,et al.  Unsupervised Real-world Image Super Resolution via Domain-distance Aware Training , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Pratul P. Srinivasan,et al.  NeRF , 2020, ECCV.

[16]  Dorothea Kolossa,et al.  Leveraging Frequency Analysis for Deep Fake Image Recognition , 2020, ICML.

[17]  Margret Keuper,et al.  Watch Your Up-Convolution: CNN Based Generative Deep Neural Networks Are Failing to Reproduce Spectral Distributions , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Yuhao Wang,et al.  Learning in the Frequency Domain , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Chen Change Loy,et al.  DeeperForensics-1.0: A Large-Scale Dataset for Real-World Face Forgery Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Alexei A. Efros,et al.  CNN-Generated Images Are Surprisingly Easy to Spot… for Now , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Tero Karras,et al.  Analyzing and Improving the Image Quality of StyleGAN , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Radu Timofte,et al.  Frequency Separation for Real-World Super-Resolution , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).

[23]  Xiaogang Wang,et al.  Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis , 2019, NeurIPS.

[24]  Xu Zhang,et al.  Detecting and Simulating Artifacts in GAN Fake Images , 2019, 2019 IEEE International Workshop on Information Forensics and Security (WIFS).

[25]  Eric P. Xing,et al.  High-Frequency Component Helps Explain the Generalization of Convolutional Neural Networks , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Jung-Woo Ha,et al.  Photorealistic Style Transfer via Wavelet Transforms , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[27]  Taesung Park,et al.  Semantic Image Synthesis With Spatially-Adaptive Normalization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Zheng Ma,et al.  Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks , 2019, Communications in Computational Physics.

[29]  Timo Aila,et al.  A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[31]  Yuning Jiang,et al.  Unified Perceptual Parsing for Scene Understanding , 2018, ECCV.

[32]  Prafulla Dhariwal,et al.  Glow: Generative Flow with Invertible 1x1 Convolutions , 2018, NeurIPS.

[33]  Yoshua Bengio,et al.  On the Spectral Bias of Neural Networks , 2018, ICML.

[34]  Han Zhang,et al.  Self-Attention Generative Adversarial Networks , 2018, ICML.

[35]  Vladlen Koltun,et al.  Semi-Parametric Image Synthesis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[36]  Jan Kautz,et al.  Multimodal Unsupervised Image-to-Image Translation , 2018, ECCV.

[37]  Yuichi Yoshida,et al.  Spectral Normalization for Generative Adversarial Networks , 2018, ICLR.

[38]  Jason Yosinski,et al.  Faster Neural Networks Straight from JPEG , 2018, NeurIPS.

[39]  Sebastian Nowozin,et al.  Which Training Methods for GANs do actually Converge? , 2018, ICML.

[40]  Alexei A. Efros,et al.  The Unreasonable Effectiveness of Deep Features as a Perceptual Metric , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[41]  Jan Kautz,et al.  High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[42]  Jung-Woo Ha,et al.  StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[43]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[44]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[45]  V. Koltun,et al.  Photographic Image Synthesis with Cascaded Refinement Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[46]  Bolei Zhou,et al.  Scene Parsing through ADE20K Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  Sepp Hochreiter,et al.  GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.

[48]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[49]  Thomas A. Funkhouser,et al.  Dilated Residual Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Jan Kautz,et al.  Unsupervised Image-to-Image Translation Networks , 2017, NIPS.

[51]  Dacheng Tao,et al.  Packing Convolutional Neural Networks in the Frequency Domain , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[52]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Vincent Dumoulin,et al.  Deconvolution and Checkerboard Artifacts , 2016 .

[54]  Yixin Chen,et al.  Compressing Convolutional Neural Networks in the Frequency Domain , 2015, KDD.

[55]  Andrea Vedaldi,et al.  Instance Normalization: The Missing Ingredient for Fast Stylization , 2016, ArXiv.

[56]  Alex Graves,et al.  Conditional Image Generation with PixelCNN Decoders , 2016, NIPS.

[57]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[58]  Abhinav Gupta,et al.  Training Region-Based Object Detectors with Online Hard Example Mining , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[59]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[60]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[61]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[62]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[63]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[64]  Saining Xie,et al.  Holistically-Nested Edge Detection , 2015, International Journal of Computer Vision.

[65]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[66]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[67]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[68]  Simon Osindero,et al.  Conditional Generative Adversarial Nets , 2014, ArXiv.

[69]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[70]  Kristen Grauman,et al.  Fine-Grained Visual Comparisons with Local Learning , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[71]  Max Welling,et al.  Semi-supervised Learning with Deep Generative Models , 2014, NIPS.

[72]  Aaron C. Courville,et al.  Generative Adversarial Nets , 2014, NIPS.

[73]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[74]  Iasonas Kokkinos,et al.  Describing Textures in the Wild , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[75]  Radim Sára,et al.  Spatial Pattern Templates for Recognition of Objects with Regular Structure , 2013, GCPR.

[76]  A. Levinskis,et al.  Convolutional Neural Network Feature Reduction using Wavelet Transform , 2013 .

[77]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[78]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[79]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[80]  Harshad Rai,et al.  Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks , 2018 .