KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation

Conventional unsupervised multi-source domain adaptation (UMDA) methods assume that all source domains can be accessed directly. This assumption neglects the privacy-preserving policy, under which all data and computations must remain decentralized. Three problems arise in this scenario: (1) Minimizing the domain distance requires pairwise computation over data from the source and target domains, which is no longer accessible. (2) Communication cost and privacy-security concerns limit the applicability of existing UMDA methods, such as domain adversarial training. (3) Since users cannot audit data quality, irrelevant or malicious source domains are more likely to appear, leading to negative transfer. In this study, we propose a privacy-preserving UMDA paradigm named Knowledge Distillation based Decentralized Domain Adaptation (KD3A), which performs domain adaptation through knowledge distillation on models trained on different source domains. KD3A solves the above problems with three components: (1) a multi-source knowledge distillation method named Knowledge Vote, which learns high-quality domain consensus knowledge; (2) a dynamic weighting strategy named Consensus Focus, which identifies both malicious and irrelevant domains; and (3) a decentralized optimization strategy for the domain distance named BatchNorm MMD. Extensive experiments on DomainNet demonstrate that KD3A is robust to negative transfer and reduces communication cost by 100x compared with other decentralized UMDA methods. Moreover, KD3A significantly outperforms state-of-the-art UMDA approaches.
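To make the components concrete, below is a minimal sketch of the Knowledge Vote idea, reconstructed from the high-level description above rather than from the authors' code; the confidence threshold, tensor shapes, and function names are illustrative assumptions.

```python
# A minimal sketch of Knowledge Vote, assuming K decentralized source
# models vote on a batch of N unlabeled target samples over C classes.
# The 0.9 threshold and the fallback rule are illustrative choices.
import torch

def knowledge_vote(teacher_probs: torch.Tensor, conf_threshold: float = 0.9):
    """teacher_probs: (K, N, C) softmax outputs of the K source models.

    Returns an (N, C) consensus pseudo-label and an (N,) sample weight
    proportional to how many high-confidence teachers support it.
    """
    K, N, C = teacher_probs.shape
    max_conf, argmax_cls = teacher_probs.max(dim=-1)         # (K, N)
    confident = max_conf >= conf_threshold                   # (K, N) bool

    # One-hot votes, kept only for confident teachers.
    votes = torch.zeros_like(teacher_probs)
    votes.scatter_(-1, argmax_cls.unsqueeze(-1), 1.0)
    votes = votes * confident.unsqueeze(-1)

    # Consensus class = class with the most confident supporters.
    support = votes.sum(dim=0)                               # (N, C)
    consensus_cls = support.argmax(dim=-1)                   # (N,)
    n_support = support.gather(-1, consensus_cls.unsqueeze(-1)).squeeze(-1)

    # Pseudo-label: average soft output of the supporting teachers;
    # fall back to a plain ensemble mean when no teacher is confident.
    supporter = votes.gather(
        -1, consensus_cls.view(1, N, 1).expand(K, N, 1)).squeeze(-1)  # (K, N)
    weight = supporter / supporter.sum(dim=0).clamp(min=1.0)
    pseudo = (teacher_probs * weight.unsqueeze(-1)).sum(dim=0)
    no_vote = n_support == 0
    pseudo[no_vote] = teacher_probs[:, no_vote].mean(dim=0)

    # Samples backed by more confident teachers get larger weight.
    return pseudo, n_support / K
```

A training step would then distill `pseudo` into the target model with a cross-entropy loss scaled by the returned sample weight, so samples with broader teacher consensus contribute more.

A companion sketch of the BatchNorm MMD idea: with a linear kernel, the squared MMD between two feature distributions reduces to a distance between their first and second moments, and BatchNorm layers already track exactly these running statistics. The distance can therefore be computed from exchanged model parameters without sharing raw data. The name-based layer pairing below assumes identical architectures and is for illustration only.

```python
# A sketch of BatchNorm MMD under the linear-kernel moment-matching view.
import torch
import torch.nn as nn

def batchnorm_mmd(source_model: nn.Module, target_model: nn.Module) -> torch.Tensor:
    """Sum the first- and second-moment distances over paired BN layers."""
    loss = torch.zeros(())
    tgt_bns = {name: m for name, m in target_model.named_modules()
               if isinstance(m, nn.BatchNorm2d)}
    for name, src_bn in source_model.named_modules():
        if not isinstance(src_bn, nn.BatchNorm2d):
            continue
        tgt_bn = tgt_bns[name]  # assumes matching architectures
        # First moment: E[x]; second moment: E[x^2] = Var[x] + E[x]^2.
        src_m1, tgt_m1 = src_bn.running_mean, tgt_bn.running_mean
        src_m2 = src_bn.running_var + src_m1 ** 2
        tgt_m2 = tgt_bn.running_var + tgt_m1 ** 2
        loss = loss + (src_m1 - tgt_m1).pow(2).sum() \
                    + (src_m2 - tgt_m2).pow(2).sum()
    return loss
```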
