CoPE: Conditional image generation using Polynomial Expansions

Generative modeling has evolved into a notable field of machine learning. Deep polynomial neural networks (PNNs) have demonstrated impressive results in unsupervised image generation, where the task is to map an input vector (i.e., noise) to a synthesized image. However, the success of PNNs has not been replicated in conditional generation tasks, such as super-resolution. Existing PNNs focus on single-variable polynomial expansions, which do not extend well to two-variable inputs, i.e., the noise variable and the conditional variable. In this work, we introduce a general framework, called CoPE, that enables a polynomial expansion of two input variables and captures their auto- and cross-correlations. We show how CoPE can be trivially extended to accept an arbitrary number of input variables. CoPE is evaluated on five tasks (class-conditional generation, inverse problems, edges-to-image translation, image-to-image translation, attribute-guided generation) involving eight datasets. The thorough evaluation suggests that CoPE can be useful for tackling diverse conditional generation tasks.
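To make the idea of a two-variable polynomial expansion concrete, the following is a minimal sketch of a second-order expansion of a noise vector and a conditional vector. It is an illustration under stated assumptions, not the parameterization used by the authors: the function names, the dense weight tensors, and the random initialization are all hypothetical and chosen only for readability. The auto-correlation terms multiply each input with itself, and the cross-correlation term multiplies the two inputs with each other.

```python
import numpy as np


def init_params(d_noise, d_cond, d_out, rng):
    """Draw random weights for a dense second-order expansion (illustrative only)."""
    beta = rng.standard_normal(d_out)                      # constant term
    w_n = rng.standard_normal((d_out, d_noise))            # first order, noise
    w_c = rng.standard_normal((d_out, d_cond))             # first order, condition
    w_nn = rng.standard_normal((d_out, d_noise, d_noise))  # auto-correlation, noise
    w_cc = rng.standard_normal((d_out, d_cond, d_cond))    # auto-correlation, condition
    w_nc = rng.standard_normal((d_out, d_noise, d_cond))   # cross-correlation
    return beta, w_n, w_c, w_nn, w_cc, w_nc


def second_order_expansion(z_noise, z_cond, params):
    """Evaluate a second-order polynomial of two input vectors.

    z_noise has shape (d_noise,), z_cond has shape (d_cond,);
    the result has shape (d_out,).
    """
    beta, w_n, w_c, w_nn, w_cc, w_nc = params
    # First-order terms: linear in each input separately.
    linear = w_n @ z_noise + w_c @ z_cond
    # Auto-correlation terms: each input interacts with itself.
    auto = (np.einsum("oij,i,j->o", w_nn, z_noise, z_noise)
            + np.einsum("oij,i,j->o", w_cc, z_cond, z_cond))
    # Cross-correlation term: the two inputs interact with each other.
    cross = np.einsum("oij,i,j->o", w_nc, z_noise, z_cond)
    return beta + linear + auto + cross


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(d_noise=3, d_cond=4, d_out=5, rng=rng)
    z_noise = rng.standard_normal(3)
    z_cond = rng.standard_normal(4)
    print(second_order_expansion(z_noise, z_cond, params).shape)
```

In this dense form the number of parameters grows quickly with the input dimensions and the polynomial order, so the sketch is suitable only for small toy inputs.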
