Improving Unsupervised Domain Adaptation with Representative Selection Techniques

Domain adaptation addresses dataset shift, where the training (source) data and the test (target) data may come from different distributions. Current research mainly focuses on either the covariate shift or the label shift setting, each of which makes a different assumption about how the source and target data are related. We observe, however, that neither setting fully matches the needs of a real-world biochemistry application. We study the difficulties that each setting encounters on this application and propose a novel method that takes both settings into account to improve performance. The key idea of the proposed method is to select examples from the source data that are similar to the target distribution of interest. We explore two selection schemes: a hard-selection scheme that plugs similarity into a nearest-neighbor style approach, and a soft-selection scheme that enforces similarity through soft constraints. Experiments demonstrate that the proposed method not only achieves better accuracy on the biochemistry application but also shows promising performance on other domain adaptation tasks when the similarity can be concretely defined.
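
The selection idea can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the form of the soft constraints, so everything below is an assumption made for illustration: similarity is taken to be squared Euclidean distance in feature space, `hard_select` keeps the source examples that are among the `k` nearest source neighbors of some target example, and `soft_weights` stands in for the soft scheme by giving each source example a weight that decays with its distance to the closest target example. The function names and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    # Squared Euclidean distance between every row of a and every row of b.
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=2)


def hard_select(source_x, target_x, k=1):
    # Hypothetical hard selection: for each target example, find its k
    # nearest source examples, then keep the union of those indices.
    d = pairwise_sq_dists(target_x, source_x)  # shape (n_target, n_source)
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.unique(nearest)


def soft_weights(source_x, target_x, temperature=1.0):
    # Hypothetical soft selection: weight each source example by how close
    # it is to its nearest target example, then normalize to sum to 1.
    d = pairwise_sq_dists(source_x, target_x)  # shape (n_source, n_target)
    w = np.exp(-d.min(axis=1) / temperature)
    return w / w.sum()


# Toy data: two source clusters, with the target near the first cluster only.
source_x = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
target_x = np.array([[0.02, 0.0], [0.0, 0.1]])

selected = hard_select(source_x, target_x, k=2)
weights = soft_weights(source_x, target_x)
print("selected source indices:", selected)
print("soft weights:", weights)
```

In this toy setup the hard scheme keeps only the source examples near the target cluster, while the soft scheme keeps every source example but gives the distant cluster negligible weight. Such weights could then be passed as per-example weights to a downstream learner.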
