Transferable Representation Learning with Deep Adaptation Networks

Domain adaptation studies learning algorithms that generalize across source domains and target domains that exhibit different distributions. Recent studies reveal that deep neural networks can learn transferable features that generalize well to similar novel tasks. However, as deep features eventually transition from general to specific along the network, feature transferability drops significantly in higher task-specific layers with increasing domain discrepancy. To formally reduce the effects of this discrepancy and enhance feature transferability in task-specific layers, we develop a novel framework for deep adaptation networks that extends deep convolutional neural networks to domain adaptation problems. The framework embeds the deep features of all task-specific layers into reproducing kernel Hilbert spaces (RKHSs) and optimally matches different domain distributions. The deep features are made more transferable by exploiting low-density separation of target-unlabeled data in very deep architectures, while the domain discrepancy is further reduced via the use of multiple kernel learning that enhances the statistical power of kernel embedding matching. The overall framework is cast in a minimax game setting. Extensive empirical evidence shows that the proposed networks yield state-of-the-art results on standard visual domain-adaptation benchmarks.

[1]  Yoshua Bengio,et al.  Semi-supervised Learning by Entropy Minimization , 2004, CAP.

[2]  Koby Crammer,et al.  Analysis of Representations for Domain Adaptation , 2006, NIPS.

[3]  Klaus-Robert Müller,et al.  Covariate Shift Adaptation by Importance Weighted Cross Validation , 2007, J. Mach. Learn. Res..

[4]  G. Griffin,et al.  Caltech-256 Object Category Dataset , 2007 .

[5]  Bernhard Schölkopf,et al.  Kernel Choice and Classifiability for RKHS Embeddings of Probability Distributions , 2009, NIPS.

[6]  Koby Crammer,et al.  A theory of learning from different domains , 2010, Machine Learning.

[7]  Yishay Mansour,et al.  Domain Adaptation: Learning Bounds and Algorithms , 2009, COLT.

[8]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[9]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[10]  Alexei A. Efros,et al.  Unbiased look at dataset bias , 2011, CVPR 2011.

[11]  Ivor W. Tsang,et al.  Domain Adaptation via Transfer Component Analysis , 2009, IEEE Transactions on Neural Networks.

[12]  Yoshua Bengio,et al.  Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.

[13]  Juhan Nam,et al.  Multimodal Deep Learning , 2011, ICML.

[14]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[15]  Rama Chellappa,et al.  Domain adaptation for object recognition: An unsupervised approach , 2011, 2011 International Conference on Computer Vision.

[16]  Nitish Srivastava,et al.  Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.

[17]  Bernhard Schölkopf,et al.  A Kernel Two-Sample Test , 2012, J. Mach. Learn. Res..

[18]  Ivor W. Tsang,et al.  Domain Transfer Multiple Kernel Learning , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Sivaraman Balakrishnan,et al.  Optimal kernel choice for large-scale two-sample tests , 2012, NIPS.

[20]  Yuan Shi,et al.  Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[22]  Kenji Fukumizu,et al.  Equivalence of distance-based and RKHS-based statistics in hypothesis testing , 2012, ArXiv.

[23]  Pascal Vincent,et al.  Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Tinne Tuytelaars,et al.  Unsupervised Visual Domain Adaptation Using Subspace Alignment , 2013, 2013 IEEE International Conference on Computer Vision.

[25]  Bernhard Schölkopf,et al.  Domain Adaptation under Target and Conditional Shift , 2013, ICML.

[26]  Philip S. Yu,et al.  Transfer Feature Learning with Joint Distribution Adaptation , 2013, 2013 IEEE International Conference on Computer Vision.

[27]  Kristen Grauman,et al.  Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation , 2013, ICML.

[28]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[29]  Trevor Darrell,et al.  Deep Domain Confusion: Maximizing for Domain Invariance , 2014, CVPR 2014.

[30]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Trevor Darrell,et al.  LSDA: Large Scale Detection through Adaptation , 2014, NIPS.

[32]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[33]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[34]  Jeff G. Schneider,et al.  Flexible Transfer Learning under Support and Model Shift , 2014, NIPS.

[35]  Ivor W. Tsang,et al.  Learning With Augmented Features for Supervised and Semi-Supervised Heterogeneous Domain Adaptation , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[37]  Ivan Laptev,et al.  Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  François Laviolette,et al.  Domain-Adversarial Neural Networks , 2014, ArXiv.

[39]  Victor S. Lempitsky,et al.  Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[40]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[41]  Arthur Gretton,et al.  Fast Two-Sample Testing with Analytic Representations of Probability Measures , 2015, NIPS.

[42]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[44]  Philip S. Yu,et al.  Domain Invariant Transfer Kernel Learning , 2015, IEEE Transactions on Knowledge and Data Engineering.

[45]  Trevor Darrell,et al.  Simultaneous Deep Transfer Across Domains and Tasks , 2015, ICCV.

[46]  François Laviolette,et al.  Domain-Adversarial Training of Neural Networks , 2015, J. Mach. Learn. Res..

[47]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[48]  Ming-Yu Liu,et al.  Coupled Generative Adversarial Networks , 2016, NIPS.

[49]  Arthur Gretton,et al.  Interpretable Distribution Features with Maximum Testing Power , 2016, NIPS.

[50]  Kate Saenko,et al.  Return of Frustratingly Easy Domain Adaptation , 2015, AAAI.

[51]  Dumitru Erhan,et al.  Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Trevor Darrell,et al.  Adversarial Discriminative Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Tomas Pfister,et al.  Learning from Simulated and Unsupervised Images through Adversarial Training , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Xiaojin Zhu,et al.  Semi-Supervised Learning , 2010, Encyclopedia of Machine Learning.

[55]  Michael I. Jordan,et al.  Deep Transfer Learning with Joint Adaptation Networks , 2016, ICML.

[56]  Fei-Fei Li,et al.  Label Efficient Learning of Transferable Representations acrosss Domains and Tasks , 2017, NIPS.