Paper Information - On Learning Invariant Representation for Domain Adaptation

Due to the ability of deep neural nets to learn rich representations, recent advances in unsupervised domain adaptation have focused on learning domain-invariant features that achieve a small error on the source domain. The hope is that the learnt representation, together with the hypothesis learnt from the source domain, can generalize to the target domain. In this paper, we first construct a simple counterexample showing that, contrary to common belief, the above conditions are not sufficient to guarantee successful domain adaptation. In particular, the counterexample exhibits conditional shift: the class-conditional distributions of input features change between source and target domains. To give a sufficient condition for domain adaptation, we propose a natural and interpretable generalization upper bound that explicitly takes into account the aforementioned shift. Moreover, we shed new light on the problem by proving an information-theoretic lower bound on the joint error of any domain adaptation method that attempts to learn invariant representations. Our result characterizes a fundamental tradeoff between learning invariant representations and achieving small joint error on both domains when the marginal label distributions differ from source to target. Finally, we conduct experiments on real-world datasets that corroborate our theoretical findings. We believe these insights are helpful in guiding the future design of domain adaptation and representation learning algorithms.
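
The abstract does not spell out its counterexample, so the following is only a minimal sketch of how invariant features with zero source error can still fail on the target. The 1-D domains, the feature map `featurize`, the labeling functions, and `hypothesis` are all assumptions chosen for illustration and are not claimed to be the paper's exact construction. Both domains map onto the same feature interval (0, 1), yet the labels are flipped between domains, which is a case of conditional shift.

```python
import random

def featurize(x):
    # Hypothetical feature map: shifts both domains onto the interval (0, 1),
    # so the feature distributions of source and target coincide.
    return x + 1.0 if x <= 0 else x - 1.0

def source_label(x):
    # Assumed source labeling on (-1, 0): 0 on the left half, 1 on the right.
    return 0 if x <= -0.5 else 1

def target_label(x):
    # Assumed target labeling on (1, 2): flipped relative to the source.
    return 1 if x <= 1.5 else 0

def hypothesis(z):
    # Classifier on the shared features that is perfect on the source domain.
    return 0 if z <= 0.5 else 1

def error(xs, label_fn):
    # Fraction of inputs where the hypothesis on features disagrees with the label.
    wrong = sum(hypothesis(featurize(x)) != label_fn(x) for x in xs)
    return wrong / len(xs)

rng = random.Random(0)
source_xs = [rng.uniform(-1.0, 0.0) for _ in range(10_000)]
target_xs = [rng.uniform(1.0, 2.0) for _ in range(10_000)]

source_error = error(source_xs, source_label)
target_error = error(target_xs, target_label)
print(f"source error: {source_error:.3f}")
print(f"target error: {target_error:.3f}")
```

In this toy setup the source error is 0 and the target error is 1, even though the features are perfectly domain-invariant: matching feature marginals says nothing about whether the class-conditional distributions line up.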
