A Unified Joint Maximum Mean Discrepancy for Domain Adaptation

Domain adaptation has received considerable attention in recent years, and many algorithms have been proposed with impressive progress. However, the distance between joint probability distributions P(X, Y) remains underexplored for this problem, because its empirical estimate derived from the maximum mean discrepancy (the joint maximum mean discrepancy, JMMD) involves a complex tensor-product operator that is hard to manipulate. To address this issue, this paper theoretically derives a unified form of JMMD that is easy to optimize, and proves that the marginal, class-conditional, and weighted class-conditional probability distribution distances are special cases of it under different label kernels. Among these, the weighted class-conditional distance not only aligns features across domains at the category level, but also handles imbalanced datasets by using the class prior probabilities. From the unified JMMD, we show that JMMD degrades the feature-label dependence (discriminability) that benefits classification, and that it is sensitive to label distribution shift when the label kernel is the weighted class-conditional one. Therefore, we leverage the Hilbert-Schmidt independence criterion and propose a novel MMD matrix to promote this dependence, and devise a novel label kernel that is robust to label distribution shift. Finally, we conduct extensive experiments on several cross-domain datasets to demonstrate the validity and effectiveness of the theoretical results.
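
As a rough illustration of how a label kernel can select different special cases, the following minimal sketch computes a biased empirical squared MMD between two labeled samples using a product kernel k_x(x, x') * k_y(y, y'). It is not the paper's derivation or its proposed label kernel: the function names, the RBF feature kernel, the `gamma` parameter, and the two example label kernels are all assumptions made for illustration, and the target labels are assumed to be given (in unsupervised adaptation they would typically be pseudo-labels). With a constant label kernel the quantity reduces to the ordinary marginal MMD; a delta label kernel compares the two domains class by class.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Gaussian kernel matrix between the rows of a and the rows of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd_matrix(n_s, n_t):
    # M = e e^T with e_i = 1/n_s for source samples and -1/n_t for target samples.
    e = np.concatenate([np.full(n_s, 1.0 / n_s), np.full(n_t, -1.0 / n_t)])
    return np.outer(e, e)

def joint_mmd2(x_s, y_s, x_t, y_t, label_kernel, gamma=1.0):
    # Biased estimate of the squared MMD between the two joint samples
    # under the product kernel k_x * k_y (hypothetical helper, not the paper's code).
    x = np.vstack([x_s, x_t])
    y = np.concatenate([y_s, y_t])
    k_x = rbf_kernel(x, x, gamma)
    k_y = label_kernel(y, y)
    m = mmd_matrix(len(x_s), len(x_t))
    # sum(K_x * K_y * M) equals trace((K_x * K_y) M) because M is symmetric.
    return float(np.sum(k_x * k_y * m))

def constant_label_kernel(a, b):
    # Ignores labels entirely, so joint_mmd2 reduces to the marginal MMD.
    return np.ones((len(a), len(b)))

def delta_label_kernel(a, b):
    # 1 when two labels agree, 0 otherwise, so only same-class pairs contribute.
    return (a[:, None] == b[None, :]).astype(float)

# Usage on synthetic data (values are random; no particular output is implied).
rng = np.random.default_rng(0)
x_s, x_t = rng.normal(size=(6, 3)), rng.normal(loc=0.5, size=(5, 3))
y_s, y_t = rng.integers(0, 2, size=6), rng.integers(0, 2, size=5)
marginal = joint_mmd2(x_s, y_s, x_t, y_t, constant_label_kernel)
per_class = joint_mmd2(x_s, y_s, x_t, y_t, delta_label_kernel)
```

Writing the estimate as an elementwise product of two kernel matrices with a single MMD matrix is one way to avoid handling the tensor-product feature map explicitly; whether this matches the unified form derived in the paper is not confirmed by the abstract.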
