Episodic Training for Domain Generalization

Domain generalization (DG) is the challenging and topical problem of learning models that generalize to novel testing domains with different statistics than a set of known training domains. The simple approach of aggregating data from all source domains and training a single deep neural network end-to-end on all the data provides a surprisingly strong baseline that surpasses many prior published methods. In this paper we build on this strong baseline by designing an episodic training procedure that trains a single deep network in a way that exposes it to the domain shift that characterises a novel domain at runtime. Specifically, we decompose a deep network into feature extractor and classifier components, and then train each component by simulating it interacting with a partner who is badly tuned for the current domain. This makes both components more robust, ultimately leading to our networks producing state-of-the-art performance on three DG benchmarks. Furthermore, we consider the pervasive workflow of using an ImageNet trained CNN as a fixed feature extractor for downstream recognition tasks. Using the Visual Decathlon benchmark, we demonstrate that our episodic-DG training improves the performance of such a general purpose feature extractor by explicitly training a feature for robustness to novel problems. This shows that DG training can benefit standard practice in computer vision.

[1]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[2]  Michael I. Jordan,et al.  Unsupervised Domain Adaptation with Residual Transfer Networks , 2016, NIPS.

[3]  Koby Crammer,et al.  Analysis of Representations for Domain Adaptation , 2006, NIPS.

[4]  Silvio Savarese,et al.  Generalizing to Unseen Domains via Adversarial Data Augmentation , 2018, NeurIPS.

[5]  Trevor Darrell,et al.  Deep Domain Confusion: Maximizing for Domain Invariance , 2014, CVPR 2014.

[6]  Siddhartha Chaudhuri,et al.  Generalizing Across Domains via Cross-Gradient Training , 2018, ICLR.

[7]  Swami Sankaranarayanan,et al.  MetaReg: Towards Domain Generalization using Meta-Regularization , 2018, NeurIPS.

[8]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[9]  Barbara Caputo,et al.  Best Sources Forward: Domain Generalization through Source-Specific Nets , 2018, 2018 25th IEEE International Conference on Image Processing (ICIP).

[10]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[12]  Yongxin Yang,et al.  Learning to Generalize: Meta-Learning for Domain Generalization , 2017, AAAI.

[13]  Pieter Abbeel,et al.  Meta-Learning with Temporal Convolutions , 2017, ArXiv.

[14]  Andrea Vedaldi,et al.  Learning multiple visual domains with residual adapters , 2017, NIPS.

[15]  Stefano Soatto,et al.  Entropy-SGD: biasing gradient descent into wide valleys , 2016, ICLR.

[16]  Antonio Torralba,et al.  Exploiting hierarchical context on a large database of object categories , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Dong Xu,et al.  Exploiting Low-Rank Structure from Latent Domains for Domain Generalization , 2014, ECCV.

[18]  George Trigeorgis,et al.  Domain Separation Networks , 2016, NIPS.

[19]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[20]  Yongxin Yang,et al.  Deeper, Broader and Artier Domain Generalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[21]  Ye Xu,et al.  Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias , 2013, 2013 IEEE International Conference on Computer Vision.

[22]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[23]  Antonio Torralba,et al.  LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.

[24]  Victor S. Lempitsky,et al.  Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[25]  Yongxin Yang,et al.  A Unified Perspective on Multi-Domain and Multi-Task Learning , 2014, ICLR.

[26]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[27]  Rémi Ronfard,et al.  Free viewpoint action recognition using motion history volumes , 2006, Comput. Vis. Image Underst..

[28]  Wei Zhou,et al.  Feature-Critic Networks for Heterogeneous Domain Generalization , 2019, ICML.

[30]  Alexei A. Efros,et al.  Undoing the Damage of Dataset Bias , 2012, ECCV.

[31]  François Laviolette,et al.  Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..

[32]  Huchuan Lu,et al.  Deep Mutual Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33]  Donald A. Adjeroh,et al.  Unified Deep Supervised Domain Adaptation and Generalization , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  Hugo Larochelle,et al.  Optimization as a Model for Few-Shot Learning , 2016, ICLR.

[35]  Alex ChiChung Kot,et al.  Domain Generalization with Adversarial Feature Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[36]  Mengjie Zhang,et al.  Domain Generalization for Object Recognition with Multi-task Autoencoders , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[37]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[38]  Andrea Vedaldi,et al.  Efficient Parametrization of Multi-domain Deep Neural Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[40]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[42]  Bernhard Schölkopf,et al.  Domain Generalization via Invariant Feature Representation , 2013, ICML.

[43]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[44]  Andrea Vedaldi,et al.  Universal representations: The missing link between faces, text, planktons, and cat breeds , 2017, ArXiv.

[45]  Jorge Nocedal,et al.  On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.