Textual sentiment classifiers assign texts to a fixed number of affective classes, such as positive, negative or neutral sentiment, or subjective versus objective information. It has been observed that sentiment classifiers suffer from a lack of generalization capability: a classifier trained on a certain domain generally performs worse on data from another domain. This phenomenon has been attributed to domain-specific affective vocabulary. In this paper, we propose a voting-based thresholding approach, which calibrates a number of existing single-domain classifiers with respect to sentiment data from a new domain. The approach presupposes only a small amount of annotated data from the new domain. We evaluate three criteria for estimating thresholds, and discuss the ramifications of these criteria for the trade-off between classifier performance and manual annotation effort.