High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network

Existing image-to-image translation (I2IT) methods are either constrained to low-resolution images or long inference time due to their heavy computational burden on the convolution of high-resolution feature maps. In this paper, we focus on speeding-up the high-resolution photorealistic I2IT tasks based on closed-form Laplacian pyramid decomposition and reconstruction. Specifically, we reveal that the attribute transformations, such as illumination and color manipulation, relate more to the low-frequency component, while the content details can be adaptively refined on high-frequency components. We consequently propose a Laplacian Pyramid Translation Network (LPTN) to simultaneously perform these two tasks, where we design a lightweight network for translating the low-frequency component with reduced resolution and a progressive masking strategy to efficiently refine the high-frequency ones. Our model avoids most of the heavy computation consumed by processing high-resolution feature maps and faithfully preserves the image details. Extensive experimental results on various tasks demonstrate that the proposed method can translate 4K images in real-time using one normal GPU while achieving comparable transformation performance against existing methods. Datasets and codes are available: https://github.com/csjliang/LPTN.

[1]  Alexei A. Efros,et al.  Real-time user-guided image colorization with learned deep priors , 2017, ACM Trans. Graph..

[2]  Taesung Park,et al.  CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.

[3]  Edward H. Adelson,et al.  The Laplacian Pyramid as a Compact Image Code , 1983, IEEE Trans. Commun..

[4]  Jan Kautz,et al.  Learning Affinity via Spatial Propagation Networks , 2017, NIPS.

[5]  Alexei A. Efros,et al.  Toward Multimodal Image-to-Image Translation , 2017, NIPS.

[6]  Yu-Chiang Frank Wang,et al.  A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation , 2018, NeurIPS.

[7]  Yung-Yu Chuang,et al.  Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[8]  Leon A. Gatys,et al.  Image Style Transfer Using Convolutional Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Joost van de Weijer,et al.  Image-to-image translation for cross-domain disentanglement , 2018, NeurIPS.

[10]  Jan Kautz,et al.  Learning Linear Transformations for Fast Image and Video Style Transfer , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Maneesh Kumar Singh,et al.  DRIT++: Diverse Image-to-Image Translation via Disentangled Representations , 2019, International Journal of Computer Vision.

[12]  Raymond Y. K. Lau,et al.  Least Squares Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[13]  Patrick Pérez,et al.  A Flexible Convolutional Solver for Fast Style Transfers , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Jung-Woo Ha,et al.  Photorealistic Style Transfer via Wavelet Transforms , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[15]  Rob Fergus,et al.  Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks , 2015, NIPS.

[16]  Narendra Ahuja,et al.  Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Jonathan T. Barron,et al.  Deep bilateral learning for real-time image enhancement , 2017, ACM Trans. Graph..

[18]  Serge J. Belongie,et al.  Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[19]  Victor S. Lempitsky,et al.  Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[20]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[21]  Jan Kautz,et al.  Multimodal Unsupervised Image-to-Image Translation , 2018, ECCV.

[22]  Harshad Rai,et al.  Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks , 2018 .

[23]  Sylvain Paris,et al.  Deep Photo Style Transfer , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Hao He,et al.  Exposure , 2017, ACM Trans. Graph..

[25]  Luc Van Gool,et al.  Exemplar Guided Unsupervised Image-to-Image Translation with Semantic Consistency , 2018, ICLR.

[26]  Nenghai Yu,et al.  Coherent Online Video Style Transfer , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[27]  Charless C. Fowlkes,et al.  Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation , 2016, ECCV.

[28]  Xueting Li,et al.  A Closed-form Solution to Photorealistic Image Stylization , 2018, ECCV.

[29]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Jan Kautz,et al.  High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[32]  Hao Wang,et al.  Real-Time Neural Style Transfer for Videos , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Rui Zhang,et al.  Harmonic Unpaired Image-to-image Translation , 2019, ICLR.

[34]  Jan Kautz,et al.  Unsupervised Image-to-Image Translation Networks , 2017, NIPS.