Triplex Transfer Learning: Exploiting Both Shared and Distinct Concepts for Text Classification

Transfer learning addresses learning scenarios in which the test data from target domains and the training data from source domains are drawn from similar but different distributions over the raw features. Along this line, some recent studies have revealed that high-level concepts, such as word clusters, can help model the differences between data distributions and are thus more appropriate for classification. Implicitly, these methods assume that all the data domains share the same set of concepts, which serve as the bridge for knowledge transfer. However, in addition to these shared concepts, each domain may have its own distinct concepts. In light of this, we systematically analyze the high-level concepts and propose a general transfer learning framework based on nonnegative matrix tri-factorization, which explores both shared and distinct concepts among all the domains simultaneously. Since this model provides more flexibility in fitting the data, it can achieve better classification accuracy. Moreover, we regularize the manifold structure in the target domains to further improve prediction performance. To solve the resulting optimization problem, we develop an iterative algorithm and theoretically analyze its convergence properties. Finally, extensive experiments show that the proposed model outperforms the baseline methods by a significant margin. In particular, our method works much better on the more challenging tasks in which the data contain distinct concepts.
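The triplex model itself is detailed in the paper body; as a rough, illustrative sketch only (not the authors' exact algorithm), the nonnegative matrix tri-factorization building block it rests on decomposes a document-term matrix X into F (document clusters), S (association matrix), and G (word clusters) with standard multiplicative updates. All function and variable names below are hypothetical; the shared/distinct-concept coupling across domains and the manifold regularization are omitted.

```python
import numpy as np

def nmtf(X, k1, k2, n_iter=300, eps=1e-9, seed=0):
    """Plain nonnegative matrix tri-factorization: X ~ F @ S @ G.T.

    F (m x k1) clusters the rows (documents), G (n x k2) clusters the
    columns (words), and S (k1 x k2) links word clusters to document
    clusters. Uses standard multiplicative updates that keep all
    factors nonnegative and do not increase the Frobenius error.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    F = rng.random((m, k1)) + eps
    S = rng.random((k1, k2)) + eps
    G = rng.random((n, k2)) + eps
    for _ in range(n_iter):
        # Each factor is rescaled by the ratio of the gradient's
        # positive and negative parts (eps guards against division by 0).
        F *= (X @ G @ S.T) / (F @ S @ (G.T @ G) @ S.T + eps)
        S *= (F.T @ X @ G) / ((F.T @ F) @ S @ (G.T @ G) + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ (F.T @ F) @ S + eps)
    return F, S, G
```

In the triplex setting, one would additionally partition the columns of G into concepts tied across all domains (shared) and concepts estimated per domain (distinct); this sketch factorizes a single domain only.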
