Paper Information - Classifier Calibration for Multi-Domain Sentiment Classification

Classifier Calibration for Multi-Domain Sentiment Classification

Textual sentiment classifiers classify texts into a fixed number of affective classes, such as positive, negative or neutral sentiment, or subjective versus objective information. It has been observed that sentiment classifiers suffer from a lack of generalization capability: a classifier trained on a certain domain generally performs worse on data from another domain. This phenomenon has been attributed to domain-specific affective vocabulary. In this paper, we propose a voting-based thresholding approach, which calibrates a number of existing single-domain classifiers with respect to sentiment data from a new domain. The approach presupposes only a small amount of annotated data from the new domain. We evaluate three criteria for estimating thresholds, and discuss the ramifications of these criteria for the trade-off between classifier performance and manual annotation effort.

Wessel Kraaij | Stephan Raaijmakers
