Paper information - Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification

Automatic sentiment classification has been extensively studied and applied in recent years. However, sentiment is expressed differently in different domains, and annotating corpora for every possible domain of interest is impractical. We investigate domain adaptation for sentiment classifiers, focusing on online reviews for different types of products. First, we extend to sentiment classification the recently proposed structural correspondence learning (SCL) algorithm, reducing the relative error due to adaptation between domains by an average of 30% over the original SCL algorithm and 46% over a supervised baseline. Second, we identify a measure of domain similarity that correlates well with the potential for adaptation of a classifier from one domain to another. This measure could, for instance, be used to select a small set of domains to annotate whose trained classifiers would transfer well to many other domains.

[1] Bo Pang, et al. Thumbs up? Sentiment Classification using Machine Learning Techniques, 2002, EMNLP.

[2] Peter D. Turney. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews, 2002, ACL.

[3] Tong Zhang, et al. Solving large scale linear prediction problems using stochastic gradient descent algorithms, 2004, ICML.

[4] Xiaoqiang Luo, et al. A Statistical Model for Multilingual Entity Detection and Tracking, 2004, NAACL.

[5] Alex Acero, et al. Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lot, 2006, Comput. Speech Lang.

[6] Bo Pang, et al. Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales, 2005, ACL.

[7] Tong Zhang, et al. A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data, 2005, J. Mach. Learn. Res.

[8] Matt Thomas, et al. Get out the vote: Determining support or opposition from Congressional floor-debate transcripts, 2006, EMNLP.

[9] Xiaojin Zhu, et al. Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization, 2006.

[10] Koby Crammer, et al. Analysis of Representations for Domain Adaptation, 2006, NIPS.

[11] John Blitzer, et al. Domain Adaptation with Structural Correspondence Learning, 2006, EMNLP.