Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation

Visual Domain Adaptation is a problem of immense importance in computer vision. Previous approaches showcase the inability of even deep neural networks to learn informative representations across domain shift. This problem is more severe for tasks where acquiring hand labeled data is extremely hard and tedious. In this work, we focus on adapting the representations learned by segmentation networks across synthetic and real domains. Contrary to previous approaches that use a simple adversarial objective or superpixel information to aid the process, we propose an approach based on Generative Adversarial Networks (GANs) that brings the embeddings closer in the learned feature space. To showcase the generality and scalability of our approach, we show that we can achieve state of the art results on two challenging scenarios of synthetic to real domain adaptation. Additional exploratory experiments show that our approach: (1) generalizes to unseen domains and (2) results in improved alignment of source and target distributions.

[1]  Rama Chellappa,et al.  Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Leon A. Gatys,et al.  A Neural Algorithm of Artistic Style , 2015, ArXiv.

[3]  Sanja Fidler,et al.  The Role of Context for Object Detection and Semantic Segmentation in the Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Dumitru Erhan,et al.  Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Carlos D. Castillo,et al.  Generate to Adapt: Aligning Domains Using Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[6]  Trevor Darrell,et al.  FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation , 2016, ArXiv.

[7]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[8]  Philip David,et al.  Domain Adaptation for Semantic Segmentation of Urban Scenes , 2017 .

[9]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[10]  Roberto Cipolla,et al.  Semantic object classes in video: A high-definition ground truth database , 2009, Pattern Recognit. Lett..

[11]  Jiasen Lu,et al.  Hierarchical Question-Image Co-Attention for Visual Question Answering , 2016, NIPS.

[12]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Michael I. Jordan,et al.  Unsupervised Domain Adaptation with Residual Transfer Networks , 2016, NIPS.

[14]  Koby Crammer,et al.  Analysis of Representations for Domain Adaptation , 2006, NIPS.

[15]  Victor S. Lempitsky,et al.  Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[16]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[17]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[18]  Vladlen Koltun,et al.  Playing for Data: Ground Truth from Computer Games , 2016, ECCV.

[19]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[20]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[21]  Hal Daumé,et al.  Frustratingly Easy Domain Adaptation , 2007, ACL.

[22]  Bo Geng,et al.  DAML: Domain Adaptation Metric Learning , 2011, IEEE Transactions on Image Processing.

[23]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[24]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[25]  Antonio M. López,et al.  The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Yuan Shi,et al.  Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Rama Chellappa,et al.  Domain adaptation for object recognition: An unsupervised approach , 2011, 2011 International Conference on Computer Vision.

[29]  Trevor Darrell,et al.  Adversarial Discriminative Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Taesung Park,et al.  CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.

[31]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[32]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[33]  Ming-Yu Liu,et al.  Coupled Generative Adversarial Networks , 2016, NIPS.