Feature Space Independent Semi-Supervised Domain Adaptation via Kernel Matching

Domain adaptation methods aim to learn a good prediction model in a label-scarce target domain by leveraging labeled patterns from a related source domain where there is a large amount of labeled data. However, in many practical domain adaptation learning scenarios, the feature distribution in the source domain is different from that in the target domain. In the extreme, the two distributions could differ completely when the feature representation of the source domain is totally different from that of the target domain. To address the problems of substantial feature distribution divergence across domains and heterogeneous feature representations of different domains, we propose a novel feature space independent semi-supervised kernel matching method for domain adaptation in this work. Our approach learns a prediction function on the labeled source data while mapping the target data points to similar source data points by matching the target kernel matrix to a submatrix of the source kernel matrix based on a Hilbert Schmidt Independence Criterion. We formulate this simultaneous learning and mapping process as a non-convex integer optimization problem and present a local minimization procedure for its relaxed continuous form. We evaluate the proposed kernel matching method using both cross domain sentiment classification tasks of Amazon product reviews and cross language text classification tasks of Reuters multilingual newswire stories. Our empirical results demonstrate that the proposed kernel matching method consistently and significantly outperforms comparison methods on both cross domain classification problems with homogeneous feature spaces and cross domain classification problems with heterogeneous feature spaces.

[1]  D K Smith,et al.  Numerical Optimization , 2001, J. Oper. Res. Soc..

[2]  John Blitzer,et al.  Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification , 2007, ACL.

[3]  Ivor W. Tsang,et al.  Learning with Augmented Features for Heterogeneous Domain Adaptation , 2012, ICML.

[4]  Koby Crammer,et al.  Analysis of Representations for Domain Adaptation , 2006, NIPS.

[5]  Avishek Saha,et al.  Co-regularization Based Semi-supervised Domain Adaptation , 2010, NIPS.

[6]  Le Song,et al.  Supervised feature selection via dependence estimation , 2007, ICML '07.

[7]  Ivor W. Tsang,et al.  Domain adaptation from multiple sources via auxiliary classifiers , 2009, ICML '09.

[8]  Philip S. Yu,et al.  Transfer Learning on Heterogenous Feature Spaces via Spectral Transformation , 2010, 2010 IEEE International Conference on Data Mining.

[9]  Qiang Yang,et al.  Adaptive Localization in a Dynamic WiFi Environment through Multi-view Learning , 2007, AAAI.

[10]  Chang Wang,et al.  Heterogeneous Domain Adaptation Using Manifold Alignment , 2011, IJCAI.

[11]  H. Shimodaira,et al.  Improving predictive inference under covariate shift by weighting the log-likelihood function , 2000 .

[12]  Daumé,et al.  Domain Adaptation meets Active Learning , 2010, HLT-NAACL 2010.

[13]  Qiang Yang,et al.  Transfer Learning by Structural Analogy , 2011, AAAI.

[14]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[15]  Le Song,et al.  Kernelized Sorting , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Trevor Darrell,et al.  Semi-supervised Domain Adaptation with Instance Constraints , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Hal Daumé,et al.  Frustratingly Easy Domain Adaptation , 2007, ACL.

[18]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[19]  Kristen Grauman,et al.  Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation , 2013, ICML.

[20]  John Blitzer,et al.  Domain Adaptation with Structural Correspondence Learning , 2006, EMNLP.

[21]  Trevor Darrell,et al.  What you saw is not what you get: Domain adaptation using asymmetric kernel transforms , 2011, CVPR 2011.

[22]  Mikhail Belkin,et al.  Using manifold structure for partially labelled classification , 2002, NIPS 2002.

[23]  John Blitzer,et al.  Co-Training for Domain Adaptation , 2011, NIPS.

[24]  Qiang Yang,et al.  Heterogeneous Transfer Learning for Image Classification , 2011, AAAI.

[25]  Qiang Yang,et al.  Translated Learning: Transfer Learning across Different Feature Spaces , 2008, NIPS.

[26]  Mohammad Rastegari,et al.  Domain Adaptive Classification , 2013, 2013 IEEE International Conference on Computer Vision.

[27]  Hal Daumé,et al.  Kernelized Sorting for Natural Language Processing , 2010, AAAI.

[28]  John Blitzer,et al.  Domain Adaptation with Coupled Subspaces , 2011, AISTATS.

[29]  Qiang Yang,et al.  Transferring Multi-device Localization Models using Latent Multi-task Learning , 2008, AAAI.

[30]  Trevor Darrell,et al.  Efficient Learning of Domain-invariant Image Representations , 2013, ICLR.

[31]  L. Rosasco,et al.  Manifold Regularization , 2007 .

[32]  Motoaki Kawanabe,et al.  Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation , 2007, NIPS.

[33]  Bernhard Schölkopf,et al.  Measuring Statistical Dependence with Hilbert-Schmidt Norms , 2005, ALT.

[34]  Yishay Mansour,et al.  Domain Adaptation with Multiple Sources , 2008, NIPS.

[35]  Kristian Kersting,et al.  Beyond 2D-grids: a dependence maximization view on image browsing , 2010, MIR '10.

[36]  Massih-Reza Amini,et al.  Learning from Multiple Partially Observed Views - an Application to Multilingual Text Categorization , 2009, NIPS.

[37]  ChengXiang Zhai,et al.  A two-stage approach to domain adaptation for statistical classifiers , 2007, CIKM '07.

[38]  Benno Stein,et al.  Cross-Language Text Classification Using Structural Correspondence Learning , 2010, ACL.

[39]  Songbo Tan,et al.  Improving SCL Model for Sentiment-Transfer Learning , 2009, HLT-NAACL.

[40]  Gokhan Tur,et al.  Co-adaptation: Adaptive co-training for semi-supervised learning , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.