Domain Conditional Predictors for Domain Adaptation

Learning guarantees often rely on the assumption that data are i.i.d., which is likely to be violated once predictors are deployed to perform real-world tasks. Domain adaptation thus emerged as a useful framework offering extra flexibility: distinct training and test data distributions are supported, provided other assumptions hold, such as covariate shift, under which the conditional distribution of labels given inputs stays the same across domains even though the input distribution may change. Several approaches have been introduced to induce generalization across varying training and test data sources, and these often rely on the general idea of domain invariance: the prediction model is encouraged to disregard the data-generating distribution. In this contribution, we tackle the problem of generalizing across data sources by taking the opposite direction. We consider a conditional modeling approach in which predictions, in addition to depending on the input data, use information about the underlying data-generating distribution. That is, the model has an explicit mechanism to adapt to changing environments and/or new data sources. We argue that this approach is more general than current domain adaptation methods, since it does not require extra assumptions such as covariate shift, and it further yields simpler training algorithms that avoid a common source of training instabilities: the minimax formulations often employed in domain-invariant methods.
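
To make the contrast concrete, the sketch below illustrates one plausible way to structure a domain-conditional predictor in PyTorch; it is an assumption for illustration, not the paper's implementation, and all names (DomainConditionalNet, num_domains, dom_dim) are hypothetical. A learned embedding of the domain identity is fed to the classifier alongside the input features, so training reduces to plain empirical risk minimization over the pooled domains, with no adversarial minimax objective.

import torch
import torch.nn as nn

class DomainConditionalNet(nn.Module):
    """Illustrative domain-conditional predictor (hypothetical sketch)."""

    def __init__(self, in_dim, num_domains, num_classes, hidden=128, dom_dim=16):
        super().__init__()
        # Feature extractor shared across all domains.
        self.features = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        # Learned embedding of the data source identity: this is the explicit
        # conditioning mechanism that lets predictions depend on the domain.
        self.domain_emb = nn.Embedding(num_domains, dom_dim)
        # Classifier conditioned on both input features and domain embedding.
        self.classifier = nn.Linear(hidden + dom_dim, num_classes)

    def forward(self, x, domain_idx):
        h = self.features(x)
        d = self.domain_emb(domain_idx)
        return self.classifier(torch.cat([h, d], dim=-1))

# Usage: standard supervised training over pooled domains, no minimax step.
model = DomainConditionalNet(in_dim=32, num_domains=4, num_classes=10)
x = torch.randn(8, 32)                      # batch of inputs
domains = torch.randint(0, 4, (8,))         # domain index of each example
logits = model(x, domains)
loss = nn.functional.cross_entropy(logits, torch.randint(0, 10, (8,)))
loss.backward()

Conditioning via concatenation is the simplest choice; feature-wise modulation of the hidden activations, in the style of FiLM layers, is a common alternative realization of the same idea.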
