Using unlabeled data to handle domain-transfer problem of semantic detection

Because sentiment classification is highly domain-specific, a supervised sentiment classifier typically requires a large amount of newly labeled training data when it is transferred to another domain. This is the so-called domain-transfer problem. In this work, we attempt to tackle the problem by combining labeled examples from the old domain with unlabeled examples from the new domain. The basic idea is to use a classifier trained on the old domain to label some informative unlabeled examples in the new domain, and then train the base classifier again. The experimental results demonstrate that the proposed method dramatically boosts the accuracy of the base sentiment classifier on the new domain.
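
The label-then-retrain idea can be illustrated with a minimal self-training sketch. It is not the authors' implementation: the abstract does not say how "informative" examples are chosen or which base classifier is used, so this sketch assumes a simple centroid classifier over bag-of-words counts and treats the examples with the largest similarity margin as the ones to label. The function names and the `rounds` and `per_round` parameters are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def train_centroids(examples):
    """Sum the term counts of each class; examples is a list of (text, label)."""
    centroids = {}
    for text, label in examples:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids


def predict(centroids, text):
    """Return (label, margin): the nearest centroid and its lead over the runner-up."""
    vec = vectorize(text)
    scored = sorted(
        ((cosine(vec, c), label) for label, c in centroids.items()), reverse=True
    )
    best_sim, best_label = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0.0
    return best_label, best_sim - runner_up


def self_train(old_labeled, new_unlabeled, rounds=3, per_round=2):
    """Assumed selection rule: each round, label the highest-margin new-domain texts."""
    labeled = list(old_labeled)
    pool = list(new_unlabeled)
    centroids = train_centroids(labeled)
    for _ in range(rounds):
        if not pool:
            break
        # Rank the unlabeled pool by how decisively the current model labels it.
        ranked = sorted(pool, key=lambda t: predict(centroids, t)[1], reverse=True)
        chosen, pool = ranked[:per_round], ranked[per_round:]
        # Add the machine-labeled examples and train the base classifier again.
        labeled += [(t, predict(centroids, t)[0]) for t in chosen]
        centroids = train_centroids(labeled)
    return centroids


# Toy data: the old domain is movie reviews, the new domain is electronics.
old_labeled = [
    ("great movie excellent plot", "pos"),
    ("boring movie terrible plot", "neg"),
]
new_unlabeled = [
    "excellent battery great durable",
    "terrible battery boring flimsy",
]
adapted = self_train(old_labeled, new_unlabeled)
```

In this toy setup the old-domain model has never seen "durable" or "flimsy", so it cannot separate them; after self-training, those new-domain words are attached to the classes of the reviews they appeared in.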
