Weakly Paired Multi-Domain Image Translation

In this paper, we study the new problem of weakly paired multi-domain image translation. To this end, we collect a dataset of weakly paired images from multiple domains. Two images are considered weakly paired if they are captured from nearby locations and share an overlapping field of view. Such images may be captured by two asynchronous cameras, which often places them in separate domains, e.g. summer and winter. The major motivations for using weakly paired images are: (i) performance that approaches that of paired data; (ii) cheap labels and abundant data. We propose the first multi-domain image translation method specifically designed for weakly paired data. The proposed method consists of an attention-based generator and a two-stream discriminator that handles misalignment between source and target images. It generates images in the target domain while preserving source image content, including foreground objects such as cars and pedestrians. Extensive experiments show that the proposed method outperforms state-of-the-art methods. The new dataset and the source code are available at https://github.com/zhangma123/weaklypaired.
