Paper information - Cross Domain Adaptation for on-Road Object Detection Using Multimodal Structure-Consistent Image-to-Image Translation

Image-to-image translation has the potential to boost the detection accuracy of a CNN-based object detector in a different domain. Although recent GAN (Generative Adversarial Network) based methods have shown compelling visual results, they are prone to failing to preserve image objects and maintain structure consistency when faced with large and complex domain shifts such as day-to-night, which reduces their practicality for tasks such as generating large-scale training data for different domains. In this work, we introduce image-translation-structure and cycle-structure consistency for generating diverse and structure-preserved translated images across complex domains, such as between day and night, for object detector training. Given only a single labelled daytime image, our model can generate a diverse collection of nighttime images with different ambient light levels and rear lamp conditions (on/off) but with the same vehicle types, colors and locations. Qualitative results show that our model can generate diverse and realistic images in the target domain. For quantitative comparison, we evaluate our method and competing methods by using the generated images to train Faster R-CNN and YOLO detectors, and show that our model achieves significant improvement and outperforms the other methods in detection accuracy.

Che-Tsung Lin
