Transfer Learning for Multiple-Domain Sentiment Analysis - Identifying Domain Dependent/Independent Word Polarity

Sentiment analysis is the task of determining the attitude (positive or negative) of documents. While the polarity of words in the documents is informative for this task, polarity of some words cannot be determined without domain knowledge. Detecting word polarity thus poses a challenge for multiple-domain sentiment analysis. Previous approaches tackle this problem with transfer learning techniques, but they cannot handle multiple source domains and multiple target domains. This paper proposes a novel Bayesian probabilistic model to handle multiple source and multiple target domains. In this model, each word is associated with three factors: Domain label, domain dependence/independence and word polarity. We derive an efficient algorithm using Gibbs sampling for inferring the parameters of the model, from both labeled and unlabeled texts. Using real data, we demonstrate the effectiveness of our model in a document polarity classification task compared with a method not considering the differences between domains. Moreover our method can also tell whether each word's polarity is domain-dependent or domain-independent. This feature allows us to construct a word polarity dictionary for each domain.

[1]  John Blitzer,et al.  Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification , 2007, ACL.

[2]  Tong Zhang,et al.  A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..

[3]  Bo Pang,et al.  Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[4]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[5]  Hongfei Yan,et al.  Jointly Modeling Aspects and Opinions with a MaxEnt-LDA Hybrid , 2010, EMNLP.

[6]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[7]  Mark Steyvers,et al.  Finding scientific topics , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[8]  Yuji Matsumoto,et al.  Extracting Aspect-Evaluation and Aspect-Of Relations in Opinion Mining , 2007, EMNLP.

[9]  Takashi Inui,et al.  Extracting Semantic Orientations of Words using Spin Model , 2005, ACL.

[10]  Qiang Yang,et al.  Transferring Knowledge from Another Domain for Learning Action Models , 2008, PRICAI.

[11]  Xu Ling,et al.  Topic sentiment mixture: modeling facets and opinions in weblogs , 2007, WWW '07.

[12]  Ivan Titov,et al.  Modeling online reviews with multi-grain topic models , 2008, WWW.

[13]  Sebastian Thrun,et al.  Text Classification from Labeled and Unlabeled Documents using EM , 2000, Machine Learning.

[14]  John Blitzer,et al.  Domain Adaptation with Structural Correspondence Learning , 2006, EMNLP.

[15]  Yulan He,et al.  A Comparative Study of Bayesian Models for Unsupervised Sentiment Detection , 2010, CoNLL.

[16]  Daniel Marcu,et al.  Domain Adaptation for Statistical Classifiers , 2006, J. Artif. Intell. Res..

[17]  Noémie Elhadad,et al.  An Unsupervised Aspect-Sentiment Model for Online Reviews , 2010, NAACL.

[18]  T. Minka Estimating a Dirichlet distribution , 2012 .

[19]  Qiang Yang,et al.  Can Movies and Books Collaborate? Cross-Domain Collaborative Filtering for Sparsity Reduction , 2009, IJCAI.