Transfer Learning by Linking Similar Feature Clusters for Sentiment Classification

Transfer learning aims to transfer knowledge from a label-rich source domain to improve the predictive model of a target domain. Previous methods achieve this by detecting a low-dimensional feature representation shared by the source and target domains. Along this line, many algorithms, e.g., dual transfer learning (DTL) and triplex transfer learning (TRi-TL), have been proposed and widely used for text classification. However, we argue that with the existing algorithms it is difficult for a model to identify exactly which concepts are common or identical across domains, given that the source and target domains are related but different. We therefore propose to transfer knowledge through similar feature clusters: we only require the common word clusters to be approximately similar across domains, rather than exactly the same. Under this assumption, the derived association matrices between word clusters and document classes should also differ slightly, to account for the variation in the word clusters. Accordingly, we propose a novel transfer learning method based on nonnegative matrix tri-factorization that links similar feature clusters (LSF-TL) for sentiment classification. It adds an approximate constraint between the similar word-cluster matrices, which allows differences while preserving the knowledge transfer. LSF-TL applies the same kind of approximate constraint to the derived cluster association matrices. We then employ an iterative updating algorithm, with a sound theoretical proof, to find a locally optimal solution. Finally, we evaluate our method with extensive experiments on Amazon product reviews. The results show that our approach achieves better classification accuracy than state-of-the-art methods on both cross-lingual sentiment classification (CLSC) and cross-lingual cross-domain sentiment classification (CLCDSC) tasks.
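
To make the idea of an approximate constraint concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each domain's word-by-document matrix X is factorized as X ≈ F S Gᵀ (F: word clusters, S: cluster-class associations, G: document classes), that both domains share one vocabulary (for example after translation), and that the approximate constraints are relaxed into squared penalties alpha·‖F_s − F_t‖² and beta·‖S_s − S_t‖². The update rules are the standard multiplicative updates for that relaxed objective, and every name (lsf_tl_sketch, alpha, beta, k) is hypothetical.

```python
import numpy as np


def lsf_tl_sketch(Xs, Xt, ys, k=4, alpha=1.0, beta=1.0, iters=50, seed=0):
    """Toy coupled nonnegative matrix tri-factorization (illustrative only).

    Xs, Xt: nonnegative word-by-document matrices with a shared vocabulary
            (an assumption of this sketch).
    ys:     integer class labels of the source documents.
    Returns predicted target labels and the loss after each iteration.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-9                      # guards against division by zero
    m, nt = Xs.shape[0], Xt.shape[1]
    c = int(ys.max()) + 1

    Gs = np.eye(c)[ys]              # source labels are known and kept fixed
    Gt = rng.random((nt, c)) + 0.1  # target document-class matrix (unknown)
    Fs = rng.random((m, k)) + 0.1   # source word clusters
    Ft = Fs.copy()                  # target word clusters, similar but free to differ
    Ss = rng.random((k, c)) + 0.1   # source cluster-class associations
    St = Ss.copy()                  # target cluster-class associations

    def loss():
        return (np.linalg.norm(Xs - Fs @ Ss @ Gs.T) ** 2
                + np.linalg.norm(Xt - Ft @ St @ Gt.T) ** 2
                + alpha * np.linalg.norm(Fs - Ft) ** 2
                + beta * np.linalg.norm(Ss - St) ** 2)

    history = []
    for _ in range(iters):
        # Word-cluster matrices: reconstruction term plus soft link to the other domain.
        Fs *= (Xs @ Gs @ Ss.T + alpha * Ft) / (Fs @ (Ss @ Gs.T @ Gs @ Ss.T) + alpha * Fs + eps)
        Ft *= (Xt @ Gt @ St.T + alpha * Fs) / (Ft @ (St @ Gt.T @ Gt @ St.T) + alpha * Ft + eps)
        # Association matrices: same soft link, weighted by beta.
        Ss *= (Fs.T @ Xs @ Gs + beta * St) / (Fs.T @ Fs @ Ss @ (Gs.T @ Gs) + beta * Ss + eps)
        St *= (Ft.T @ Xt @ Gt + beta * Ss) / (Ft.T @ Ft @ St @ (Gt.T @ Gt) + beta * St + eps)
        # Target document classes: plain tri-factorization update.
        Gt *= (Xt.T @ Ft @ St) / (Gt @ (St.T @ Ft.T @ Ft @ St) + eps)
        history.append(loss())

    return Gt.argmax(axis=1), history
```

In this sketch, larger alpha and beta pull the two domains' factors closer together, while small values let them drift apart; the soft penalty stands in for the approximate constraint described above.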
