Domain-Adversarial Training of Neural Networks

We introduce a new representation learning approach for domain adaptation, in which data at training and test time come from similar but different distributions. Our approach is directly inspired by the theory on domain adaptation suggesting that, for effective domain transfer to be achieved, predictions must be made based on features that cannot discriminate between the training (source) and test (target) domains. The approach implements this idea in the context of neural network architectures that are trained on labeled data from the source domain and unlabeled data from the target domain (no labeled target-domain data is necessary). As the training progresses, the approach promotes the emergence of features that are (i) discriminative for the main learning task on the source domain and (ii) indiscriminate with respect to the shift between the domains. We show that this adaptation behaviour can be achieved in almost any feed-forward model by augmenting it with few standard layers and a new gradient reversal layer. The resulting augmented architecture can be trained using standard backpropagation and stochastic gradient descent, and can thus be implemented with little effort using any of the deep learning packages. We demonstrate the success of our approach for two distinct classification problems (document sentiment analysis and image classification), where state-of-the-art domain adaptation performance on standard benchmarks is achieved. We also validate the approach for descriptor learning task in the context of person re-identification application.

[1]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[2]  Ronitt Rubinfeld,et al.  Testing that distributions are close , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[3]  Shai Ben-David,et al.  Detecting Change in Data Streams , 2004, VLDB.

[4]  Koby Crammer,et al.  Analysis of Representations for Domain Adaptation , 2006, NIPS.

[5]  John Blitzer,et al.  Domain Adaptation with Structural Correspondence Learning , 2006, EMNLP.

[6]  Hans-Peter Kriegel,et al.  Integrating structured biological data by Kernel Maximum Mean Discrepancy , 2006, ISMB.

[7]  Bernhard Schölkopf,et al.  Correcting Sample Selection Bias by Unlabeled Data , 2006, NIPS.

[8]  Hai Tao,et al.  Evaluating Appearance Models for Recognition, Reacquisition, and Tracking , 2007 .

[9]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[10]  Yoshua Bengio,et al.  Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[11]  Yishay Mansour,et al.  Multiple Source Adaptation and the Rényi Divergence , 2009, UAI.

[12]  Koby Crammer,et al.  A theory of learning from different domains , 2010, Machine Learning.

[13]  Yishay Mansour,et al.  Domain Adaptation: Learning Bounds and Algorithms , 2009, COLT.

[14]  Cordelia Schmid,et al.  Multi-view object class detection with a 3D geometric model , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[15]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[16]  Michael Goesele,et al.  Back to the Future: Learning Shape Models from 3D CAD Data , 2010, BMVC.

[17]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[18]  Lorenzo Bruzzone,et al.  Domain Adaptation Problems: A DASVM Classification Technique and a Circular Validation Strategy , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Qiang Yang,et al.  Cross Validation Framework to Choose amongst Models and Datasets for Transfer Learning , 2010, ECML/PKDD.

[20]  Andrew Y. Ng,et al.  Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .

[21]  Charless C. Fowlkes,et al.  Contour Detection and Hierarchical Image Segmentation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Horst Bischof,et al.  Person Re-identification by Descriptive and Discriminative Classification , 2011, SCIA.

[23]  Ivor W. Tsang,et al.  Domain Adaptation via Transfer Component Analysis , 2009, IEEE Transactions on Neural Networks.

[24]  Yoshua Bengio,et al.  Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.

[25]  Tapani Raiko,et al.  The NIPS workshop on Deep Learning and Unsupervised Feature Learning , 2011 .

[26]  Rama Chellappa,et al.  Domain adaptation for object recognition: An unsupervised approach , 2011, 2011 International Conference on Computer Vision.

[27]  Jürgen Schmidhuber,et al.  Multi-column deep neural network for traffic sign classification , 2012, Neural Networks.

[28]  François Laviolette,et al.  Bayesian Comparison of Machine Learning Algorithms on Single and Multiple Datasets , 2012, AISTATS.

[29]  Yuan Shi,et al.  Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[31]  Kilian Q. Weinberger,et al.  Marginalized Denoising Autoencoders for Domain Adaptation , 2012, ICML.

[32]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[33]  Alexander Yates,et al.  Biased Representation Learning for Domain Adaptation , 2012, EMNLP.

[34]  Tinne Tuytelaars,et al.  Unsupervised Visual Domain Adaptation Using Subspace Alignment , 2013, 2013 IEEE International Conference on Computer Vision.

[35]  Chunxiao Liu,et al.  POP: Person Re-identification Post-rank Optimisation , 2013, 2013 IEEE International Conference on Computer Vision.

[36]  Xiaogang Wang,et al.  Person Re-identification by Salience Matching , 2013, 2013 IEEE International Conference on Computer Vision.

[37]  Sumit Chopra,et al.  DLID: Deep Learning for Domain Adaptation by Interpolating between Domains , 2013 .

[38]  Brian C. Lovell,et al.  Unsupervised Domain Adaptation by Domain Invariant Projection , 2013, 2013 IEEE International Conference on Computer Vision.

[39]  François Laviolette,et al.  A PAC-Bayesian Approach for Domain Adaptation with Specialization to Linear Classifiers , 2013, ICML.

[40]  Kristen Grauman,et al.  Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation , 2013, ICML.

[41]  Xiaogang Wang,et al.  Locally Aligned Feature Transforms across Views , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[42]  Venkatesh Saligrama,et al.  Person Re-identification via Structured Prediction , 2014, ArXiv.

[43]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[44]  Trevor Darrell,et al.  Deep Domain Confusion: Maximizing for Domain Invariance , 2014, CVPR 2014.

[45]  Shaogang Gong,et al.  Person Re-Identification , 2014 .

[46]  Mehryar Mohri,et al.  Domain adaptation and sample bias correction theory and algorithm for regression , 2014, Theor. Comput. Sci..

[47]  Kate Saenko,et al.  From Virtual to Reality: Fast Adaptation of Virtual Object Detectors to Real Domains , 2014, BMVC.

[48]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[49]  David Vázquez,et al.  Virtual and Real World Adaptationfor Pedestrian Detection , 2014, IEEE Trans. Pattern Anal. Mach. Intell..

[50]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[51]  Trevor Darrell,et al.  One-Shot Adaptation of Supervised Deep Convolutional Models , 2013, ICLR.

[52]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[53]  Stan Z. Li,et al.  Deep Metric Learning for Practical Person Re-Identification , 2014, ArXiv.

[54]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[55]  Ivan Laptev,et al.  Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[56]  François Laviolette,et al.  Domain-Adversarial Neural Networks , 2014, ArXiv.

[57]  Anton van den Hengel,et al.  Learning to rank in person re-identification with metric ensembles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[58]  Ping Li,et al.  Cross-Domain Person Reidentification Using Domain Adaptation Ranking SVMs , 2015, IEEE Transactions on Image Processing.

[59]  David Stutz,et al.  Neural Codes for Image Retrieval , 2015 .

[60]  Victor S. Lempitsky,et al.  Unsupervised Domain Adaptation by Backpropagation , 2014, ICML.

[61]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[62]  Jian Dong,et al.  Deep domain adaptation for describing people based on fine-grained clothing attributes , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Xiaogang Wang,et al.  Person Re-Identification by Saliency Learning , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.