Cross-domain sentiment classification via topical correspondence transfer

Sentiment classification aims to automatically predict the sentiment polarity (e.g., positive or negative) of user-generated sentiment data (e.g., reviews, blogs). In real applications, user-generated sentiment data can span so many different domains that it is difficult to manually label training data for all of them. In this article, we develop a general solution to cross-domain sentiment classification for the setting in which we have no labeled data in the target domain but some labeled data in a source domain. To bridge the gap between domains, we propose a novel algorithm called topical correspondence transfer (TCT). TCT maps the domain-specific information from different domains into unified topics, with the help of topics shared across all domains. The topical correspondences behind the shared topics then serve as a bridge that reduces the gap between domains. We conduct experiments on a benchmark composed of reviews of four types of Amazon products. Experimental results show that TCT significantly outperforms the baseline method and achieves an accuracy that is competitive with state-of-the-art methods for cross-domain sentiment classification.

Highlights
We propose a novel algorithm, called topical correspondence transfer (TCT), for cross-domain sentiment classification.
The proposed framework learns the domain-specific information from different domains into unified topics, with the help of shared topics across all domains.
An objective function is defined to simultaneously learn a topical representation and document sentiment prediction with the model.
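The idea of combining topics shared across domains with domain-specific topics can be illustrated with a small sketch. The abstract does not state which factorization or objective TCT uses, so everything below is an assumption made for illustration: the code factorizes each domain's nonnegative term-document matrix as X_d ≈ [W_shared, W_d] H_d using standard multiplicative updates for nonnegative matrix factorization, and it leaves out the sentiment prediction term entirely. The function name `joint_nmf` and all parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np


def joint_nmf(X_list, k_shared, k_specific, n_iter=200, seed=0, eps=1e-9):
    """Illustrative joint NMF (an assumption, not the paper's TCT objective).

    Each X in X_list is a nonnegative (terms x documents) matrix for one
    domain; all domains share the same vocabulary (same number of rows).
    Returns the shared basis, the per-domain bases, the per-domain topic
    coefficients, and the total squared reconstruction error per iteration.
    """
    rng = np.random.default_rng(seed)
    m = X_list[0].shape[0]
    W_c = rng.random((m, k_shared))                       # shared topics
    W_s = [rng.random((m, k_specific)) for _ in X_list]   # domain-specific topics
    H = [rng.random((k_shared + k_specific, X.shape[1])) for X in X_list]
    errors = []

    for _ in range(n_iter):
        num_c = np.zeros_like(W_c)
        den_c = np.zeros_like(W_c)
        for d, X in enumerate(X_list):
            W = np.hstack([W_c, W_s[d]])
            # Multiplicative update of this domain's topic coefficients.
            H[d] *= (W.T @ X) / (W.T @ W @ H[d] + eps)
            H_c, H_s = H[d][:k_shared], H[d][k_shared:]
            # Multiplicative update of this domain's specific topics.
            W_s[d] *= (X @ H_s.T) / (W @ H[d] @ H_s.T + eps)
            # Accumulate the shared-topic update over all domains.
            WH = np.hstack([W_c, W_s[d]]) @ H[d]
            num_c += X @ H_c.T
            den_c += WH @ H_c.T
        W_c *= num_c / (den_c + eps)

        err = sum(
            np.linalg.norm(X - np.hstack([W_c, W_s[d]]) @ H[d]) ** 2
            for d, X in enumerate(X_list)
        )
        errors.append(err)

    return W_c, W_s, H, errors


# Toy usage with random data standing in for source and target domains.
rng = np.random.default_rng(1)
X_source = rng.random((30, 20))
X_target = rng.random((30, 25))
W_c, W_s, H, errors = joint_nmf([X_source, X_target], k_shared=4, k_specific=2)
```

In this sketch, the first `k_shared` rows of each `H[d]` describe documents from every domain in terms of the same shared topics, so a classifier trained on those rows for labeled source documents could in principle be applied to target documents. Whether TCT uses this representation in that way is not stated in the abstract.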
