Discriminative Transfer Learning on Manifold

Collective matrix factorization has achieved remarkable success in document classification in the transfer learning literature. However, the learned latent factors still suffer from the divergence between domains and are therefore often not discriminative enough to assign category labels correctly. Based on these observations, we impose a discriminative regression model over the latent factors to improve label prediction. Moreover, we propose to minimize the Maximum Mean Discrepancy (MMD) in the latent manifold subspace, rather than in the original data space as is typical, to bridge the gap between domains. Specifically, we formulate these objectives as a joint optimization framework with two matrix tri-factorizations, one for the source domain and one for the target domain, optimized simultaneously. We develop an iterative algorithm, DTLM, and discuss a theoretical analysis of its convergence. An empirical study on benchmark datasets shows that DTLM consistently improves classification accuracy compared with state-of-the-art transfer learning methods.
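
To make the Maximum Mean Discrepancy term concrete, the following is a minimal sketch of the empirical squared discrepancy under a linear kernel, which reduces to the squared Euclidean distance between the mean latent vectors of the two domains. It is an illustration only, not the DTLM objective: the function names and the toy latent factors are assumptions introduced here, and the choice of a linear kernel is likewise an assumption.

```python
from typing import List, Sequence


def column_means(rows: Sequence[Sequence[float]]) -> List[float]:
    """Return the per-dimension mean of a list of equal-length vectors."""
    n = len(rows)
    dim = len(rows[0])
    return [sum(row[j] for row in rows) / n for j in range(dim)]


def linear_mmd_squared(
    source: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
) -> float:
    """Empirical squared MMD with a linear kernel.

    With a linear kernel, the discrepancy is the squared Euclidean
    distance between the source mean and the target mean.
    """
    mu_s = column_means(source)
    mu_t = column_means(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


# Hypothetical latent factors: one row per document, one column per latent dimension.
source_latent = [[1.0, 0.0], [3.0, 2.0]]
target_latent = [[2.0, 1.0], [4.0, 5.0]]

gap = linear_mmd_squared(source_latent, target_latent)
print(gap)
```

A smaller value means the two domains have closer mean representations in the latent space, which is the sense in which minimizing this quantity bridges the domains.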
