LAMDA: Label Matching Deep Domain Adaptation

Deep domain adaptation (DDA) approaches have recently been shown to outperform their shallow counterparts, owing to their greater modeling capacity on complex domains (e.g., images, structured data, and sequential data). The underlying idea is to learn domain-invariant representations in a latent space that bridges the gap between the source and target domains. Several theoretical studies have established insightful understanding of the benefits of learning domain-invariant features; however, they are usually limited to the case of no label shift, which hinders their applicability. In this paper, we propose and study a new, challenging setting that allows us to use a Wasserstein (WS) distance not only to quantify the data shift but also to define the label shift directly. We further develop a theory showing that minimizing the WS distance of the data shift closes the gap between the source and target data distributions in the latent space (e.g., an intermediate layer of a deep network), while the label shift with respect to this latent space remains quantifiable. Interestingly, our theory can consequently explain certain drawbacks of learning domain-invariant features in the latent space. Finally, guided by the results of our developed theory, we propose the Label Matching Deep Domain Adaptation (LAMDA) approach, which outperforms baselines on real-world datasets for DA problems.
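For concreteness, the Wasserstein distance invoked above is the standard optimal-transport discrepancy: for distributions P and Q over a common space and a cost function c,

```latex
W_c(\mathbb{P}, \mathbb{Q}) \;=\; \inf_{\gamma \in \Gamma(\mathbb{P}, \mathbb{Q})} \int c(x, y)\, \mathrm{d}\gamma(x, y),
```

where Γ(P, Q) is the set of couplings, i.e., joint distributions whose marginals are P and Q. The abstract does not spell out LAMDA's training objective, so the snippet below is only a minimal, hypothetical sketch of the general idea it describes: penalizing a WS discrepancy between source and target features in a latent space alongside the supervised source loss. It uses a sliced Wasserstein estimate as a cheap Monte-Carlo stand-in for the exact distance, and the names `encoder`, `classifier`, and `lam` are illustrative assumptions, not the paper's method or API.

```python
# Illustrative sketch (NOT the paper's LAMDA algorithm): align source and
# target latent features by minimizing a sliced Wasserstein distance, one
# common way to instantiate "minimize the WS distance of the data shift on
# the latent space". Assumes equally sized batches of latent codes.
import torch

def sliced_wasserstein(z_src: torch.Tensor,
                       z_tgt: torch.Tensor,
                       n_projections: int = 128) -> torch.Tensor:
    """Monte-Carlo estimate of the squared sliced 2-Wasserstein distance
    between two (batch, dim) batches of latent codes."""
    dim = z_src.size(1)
    # Random unit directions on the sphere (one per column).
    theta = torch.randn(dim, n_projections, device=z_src.device)
    theta = theta / theta.norm(dim=0, keepdim=True)
    # Project both batches onto every direction: shape (batch, n_projections).
    proj_src = z_src @ theta
    proj_tgt = z_tgt @ theta
    # In 1D, the optimal coupling simply sorts both samples.
    proj_src, _ = torch.sort(proj_src, dim=0)
    proj_tgt, _ = torch.sort(proj_tgt, dim=0)
    return ((proj_src - proj_tgt) ** 2).mean()

# Hypothetical usage inside a training step (encoder/classifier assumed):
#   z_s, z_t = encoder(x_src), encoder(x_tgt)
#   loss = task_loss(classifier(z_s), y_src) + lam * sliced_wasserstein(z_s, z_t)
```

The design choice here is purely pragmatic: the sliced variant reduces the d-dimensional transport problem to sorted 1D projections, which keeps the alignment penalty differentiable and cheap enough to evaluate at every minibatch.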
