Learning Across Tasks and Domains

Recent works have proven that many relevant visual tasks are closely related one to another. Yet, this connection is seldom deployed in practice due to the lack of practical methodologies to transfer learned concepts across different training processes. In this work, we introduce a novel adaptation framework that can operate across both task and domains. Our framework learns to transfer knowledge across tasks in a fully supervised domain (e.g., synthetic data) and use this knowledge on a different domain where we have only partial supervision (e.g., real data). Our proposal is complementary to existing domain adaptation techniques and extends them to cross tasks scenarios providing additional performance gains. We prove the effectiveness of our framework across two challenging tasks (i.e., monocular depth estimation and semantic segmentation) and four different domains (Synthia, Carla, Kitti, and Cityscapes).

[1]  Li Fei-Fei,et al.  Dynamic Task Prioritization for Multitask Learning , 2018, ECCV.

[2]  Philip David,et al.  Domain Adaptation for Semantic Segmentation of Urban Scenes , 2017 .

[3]  Thomas A. Funkhouser,et al.  Dilated Residual Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Marc Pollefeys,et al.  Slanted Stixels: Representing San Francisco's Steepest Streets , 2017, BMVC.

[5]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[6]  Sebastian Ramos,et al.  The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[8]  Tomas Pfister,et al.  Learning from Simulated and Unsupervised Images through Adversarial Training , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Andrea Vedaldi,et al.  Supervising the New with the Old: Learning SFM from SFM , 2018, ECCV.

[10]  Trevor Darrell,et al.  Adversarial Discriminative Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Taesung Park,et al.  CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.

[12]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[13]  Senthil Yogamani,et al.  AuxNet: Auxiliary tasks enhanced Semantic Segmentation for Automated Driving , 2019, VISIGRAPP.

[14]  Roberto Cipolla,et al.  Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[15]  Leonidas J. Guibas,et al.  Taskonomy: Disentangling Task Transfer Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[16]  Trevor Darrell,et al.  Simultaneous Deep Transfer Across Domains and Tasks , 2015, ICCV.

[17]  Andreas Geiger,et al.  Augmented Reality Meets Computer Vision: Efficient Data Generation for Urban Driving Scenes , 2017, International Journal of Computer Vision.

[18]  Germán Ros,et al.  CARLA: An Open Urban Driving Simulator , 2017, CoRL.

[19]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Luigi di Stefano,et al.  Geometry meets semantics for semi-supervised monocular depth estimation , 2018, ACCV.

[21]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Yuan Shi,et al.  Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Rob Fergus,et al.  Depth Map Prediction from a Single Image using a Multi-Scale Deep Network , 2014, NIPS.

[24]  Trevor Darrell,et al.  FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation , 2016, ArXiv.

[25]  Swami Sankaranarayanan,et al.  Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26]  Jianfei Cai,et al.  T2Net: Synthetic-to-Realistic Translation for Solving Single-Image Depth Estimation Tasks , 2018, ECCV.

[27]  Jörg Stückler,et al.  Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry , 2018, ECCV.

[28]  Mei Wang,et al.  Deep Visual Domain Adaptation: A Survey , 2018, Neurocomputing.

[29]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Andreas Geiger,et al.  Object scene flow for autonomous vehicles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Kaiming He,et al.  Rethinking ImageNet Pre-Training , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[32]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[33]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[34]  Luigi di Stefano,et al.  Exploiting semantics in adversarial training for image-level domain adaptation , 2018, 2018 IEEE International Conference on Image Processing, Applications and Systems (IPAS).

[35]  Rama Chellappa,et al.  Domain adaptation for object recognition: An unsupervised approach , 2011, 2011 International Conference on Computer Vision.

[36]  Dong Liu,et al.  Fully Convolutional Adaptation Networks for Semantic Segmentation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[37]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[38]  Fei-Fei Li,et al.  Label Efficient Learning of Transferable Representations acrosss Domains and Tasks , 2017, NIPS.

[39]  Andreas Geiger,et al.  Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[40]  Victor S. Lempitsky,et al.  Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[41]  H. Hirschmüller Accurate and Efficient Stereo Processing by Semi-Global Matching and Mutual Information , 2005, CVPR.

[42]  Iasonas Kokkinos,et al.  UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Vineeth N. Balasubramanian,et al.  Zero-Shot Task Transfer , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Ming-Hsuan Yang,et al.  Learning to Adapt Structured Output Space for Semantic Segmentation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[45]  Trevor Darrell,et al.  Simultaneous Deep Transfer Across Domains and Tasks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[46]  Dumitru Erhan,et al.  Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[48]  Andrew Zisserman,et al.  Multi-task Self-Supervised Visual Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[49]  Luigi di Stefano,et al.  Unsupervised Adaptation for Deep Stereo , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[50]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.