Scene Adaptation for Semantic Segmentation using Adversarial Learning

Semantic segmentation algorithms based on the deep learning paradigm have reached outstanding performance. However, achieving good results in a new domain generally requires fine-tuning a pre-trained deep architecture on new labeled data from the target application domain. Fine-tuning is also required when the application settings change, e.g., when a camera is moved or a new camera is installed. This implies collecting and labeling images pixel by pixel for training, which slows down the deployment of semantic segmentation systems in real industrial scenarios and increases costs. To address these issues, in this paper we propose an approach based on Adversarial Learning to perform scene adaptation for semantic segmentation. We frame scene adaptation as the task of predicting semantic segmentation masks for images belonging to a Target Scene Context, given labeled images from a Source Scene Context and unlabeled images from the Target Scene Context. Experiments show that the proposed method achieves promising performance both when the two scenes contain similar content (i.e., they are two different points of view of the same scene) and when the observed scenes contain unrelated content (i.e., they correspond to completely different scenes).
