Transporting Causal Mechanisms for Unsupervised Domain Adaptation

Existing Unsupervised Domain Adaptation (UDA) literature adopts the covariate shift and conditional shift assumptions, which essentially encourage models to learn features common across domains. However, because the target domain lacks supervision, these methods suffer from semantic loss: the learned feature inevitably discards semantics that are nondiscriminative in the source domain but discriminative in the target domain. We use a causal view, transportability theory [40], to identify that this loss is in fact a confounding effect, which can only be removed by causal intervention. However, the theoretical solution provided by transportability is far from practical for UDA, because it requires stratifying and representing the unobserved confounder that causes the domain gap. To this end, we propose a practical solution, Transporting Causal Mechanisms (TCM), which identifies the confounder strata and their representations using domain-invariant disentangled causal mechanisms discovered in an unsupervised fashion. Our TCM is both theoretically and empirically grounded. Extensive experiments show that TCM achieves state-of-the-art performance on three challenging UDA benchmarks: ImageCLEF-DA, Office-Home, and VisDA-2017. Code is available at https://github.com/yue-zhongqi/tcm.

[1]  Sethuraman Panchanathan,et al.  Deep Hashing Network for Unsupervised Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Risi Kondor,et al.  On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups , 2018, ICML.

[3]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Mark Chen,et al.  Language Models are Few-Shot Learners , 2020, NeurIPS.

[5]  Bernhard Schölkopf,et al.  Learning Independent Causal Mechanisms , 2017, ICML.

[6]  Ivor W. Tsang,et al.  Domain adaptation from multiple sources via auxiliary classifiers , 2009, ICML '09.

[7]  David Pfau,et al.  Towards a Definition of Disentangled Representations , 2018, ArXiv.

[8]  Wei Liu,et al.  Zero-Shot Visual Recognition Using Semantics-Preserving Adversarial Embedding Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[9]  Lei Zhang,et al.  Label Propagation with Augmented Anchors: A Simple Semi-Supervised Learning baseline for Unsupervised Domain Adaptation , 2020, ECCV.

[10]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[11]  Z. Geng,et al.  Identifying Causal Effects With Proxy Variables of an Unmeasured Confounder , 2016, Biometrika.

[12]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Chuan-Xian Ren,et al.  Enhanced Transport Distance for Unsupervised Domain Adaptation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008.

[15]  Mingkui Tan,et al.  Domain-Symmetric Networks for Adversarial Domain Adaptation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Trevor Darrell,et al.  Deep Domain Confusion: Maximizing for Domain Invariance , 2014, CVPR 2014.

[17]  Han Zhao,et al.  On Learning Invariant Representations for Domain Adaptation , 2019, ICML.

[18]  Mélanie Frappier,et al.  The Book of Why: The New Science of Cause and Effect , 2018, Science.

[19]  Stefan Bauer,et al.  Robustly Disentangled Causal Mechanisms: Validating Deep Representations for Interventional Robustness , 2018, ICML.

[20]  David J. Kriegman,et al.  Image to Image Translation for Domain Adaptation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[21]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[22]  Tatsuya Harada,et al.  Maximum Classifier Discrepancy for Unsupervised Domain Adaptation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[23]  David A. Forsyth,et al.  SafetyNet: Detecting and Rejecting Adversarial Examples Robustly , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[24]  Kate Saenko,et al.  VisDA: The Visual Domain Adaptation Challenge , 2017, ArXiv.

[25]  Taesung Park,et al.  CyCADA: Cycle-Consistent Adversarial Domain Adaptation , 2017, ICML.

[26]  Matti Lassas,et al.  Globally Injective ReLU Networks , 2020, ArXiv.

[27]  Qingming Huang,et al.  Gradually Vanishing Bridge for Adversarial Domain Adaptation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Tatsuya Harada,et al.  Asymmetric Tri-training for Unsupervised Domain Adaptation , 2017, ICML.

[29]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[30]  Nuno Vasconcelos,et al.  Bidirectional Learning for Domain Adaptation of Semantic Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Koby Crammer,et al.  A theory of learning from different domains , 2010, Machine Learning.

[32]  Stella X. Yu,et al.  Large-Scale Long-Tailed Recognition in an Open World , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Philip S. Yu,et al.  Transfer Feature Learning with Joint Distribution Adaptation , 2013, 2013 IEEE International Conference on Computer Vision.

[34]  Michael I. Jordan,et al.  Conditional Adversarial Domain Adaptation , 2017, NeurIPS.

[35]  Sunita Sarawagi,et al.  Domain Adaptation of Conditional Probability Models Via Feature Subsetting , 2007, PKDD.

[36]  Victor S. Lempitsky,et al.  Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[37]  François Laviolette,et al.  Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res.

[38]  Bernhard Schölkopf,et al.  Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations , 2018, ICML.

[39]  Carlos D. Castillo,et al.  Generate to Adapt: Aligning Domains Using Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[40]  Philip S. Yu,et al.  Transfer Joint Matching for Unsupervised Domain Adaptation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[41]  Xian-Sheng Hua,et al.  Counterfactual Zero-Shot and Open-Set Visual Recognition , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[43]  Xiangyu Zhang,et al.  Reliable Weighted Optimal Transport for Unsupervised Domain Adaptation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Steffen Bickel,et al.  Discriminative Learning Under Covariate Shift , 2009, J. Mach. Learn. Res.

[45]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[46]  Bernhard Schölkopf,et al.  Domain Adaptation under Target and Conditional Shift , 2013, ICML.

[47]  Vinay P. Namboodiri,et al.  Attending to Discriminative Certainty for Domain Adaptation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[48]  John Blitzer,et al.  Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification , 2007, ACL.

[49]  Ahmed El-Roby,et al.  Dual Mixup Regularized Learning for Adversarial Domain Adaptation , 2020, ECCV.

[50]  Yi Yang,et al.  Contrastive Adaptation Network for Unsupervised Domain Adaptation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Elias Bareinboim,et al.  External Validity: From Do-Calculus to Transportability Across Populations , 2014, Probabilistic and Causal Inference.

[52]  Qingming Huang,et al.  Heuristic Domain Adaptation , 2020, NeurIPS.

[53]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[54]  J. Pearl,et al.  Causal Inference in Statistics: A Primer , 2016.

[55]  Ivor W. Tsang,et al.  Domain Adaptation via Transfer Component Analysis , 2009, IEEE Transactions on Neural Networks.

[56]  Klaus-Robert Müller,et al.  Covariate Shift Adaptation by Importance Weighted Cross Validation , 2007, J. Mach. Learn. Res.

[57]  George Trigeorgis,et al.  Domain Separation Networks , 2016, NIPS.

[58]  Harshad Rai,et al.  Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks , 2018.

[59]  Jianmin Wang,et al.  Multi-Adversarial Domain Adaptation , 2018, AAAI.

[60]  J. Pearl.  Causality: Models, Reasoning and Inference , 2000.

[61]  Zijian Li,et al.  Learning Disentangled Semantic Representation for Domain Adaptation , 2019, IJCAI.

[62]  Bernhard Schölkopf,et al.  Counterfactuals uncover the modular structure of deep generative models , 2018, ICLR.

[63]  Yuchen Zhang,et al.  Bridging Theory and Algorithm for Domain Adaptation , 2019, ICML.

[64]  Ian J. Wassell,et al.  Re-weighted Adversarial Adaptation Network for Unsupervised Domain Adaptation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[65]  Rainer Stiefelhagen,et al.  Adaptiope: A Modern Benchmark for Unsupervised Domain Adaptation , 2021, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).