Combining a large sentiment lexicon and machine learning for subjectivity classification

Most previous work on subjectivity/sentiment classification is based on either machine learning techniques (such as SVM, Maximum Entropy, and Naive Bayes) or general sentiment lexicons. This paper presents a novel approach that combines a large sentiment lexicon with machine learning techniques for opinion analysis: 1) a large sentiment lexicon is automatically adjusted according to the training data; 2) machine learning techniques are used to learn models from the training data; 3) the results of the machine learning classifiers and the supervised lexicon-based classifier are combined to obtain better results. Experiments on the NTCIR data show that our approach significantly outperforms the baselines on subjectivity classification: the adjusted large sentiment lexicon performs well on its own, and combining it with machine learning techniques yields further improvement.
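
The three steps can be illustrated with a minimal Python sketch. The abstract does not say how the lexicon is adjusted, which features the classifiers use, or how their outputs are combined, so everything below is an assumption made for illustration: the toy training sentences, the four-word lexicon, the log-odds reweighting in `adjust_lexicon`, the hand-written Naive Bayes model, and the equal-weight averaging in `classify` are all hypothetical and are not the authors' method.

```python
import math
from collections import Counter

# Hypothetical toy data: (tokens, label), label 1 = subjective, 0 = objective.
TRAIN = [
    ("the plan is a terrible mistake".split(), 1),
    ("critics praised the wonderful proposal".split(), 1),
    ("officials fear a terrible outcome".split(), 1),
    ("the meeting was held on monday".split(), 0),
    ("the committee has twelve members".split(), 0),
    ("the report was published on monday".split(), 0),
]
# Hypothetical general lexicon: word -> prior subjectivity weight.
LEXICON = {"terrible": 1.0, "wonderful": 1.0, "fear": 1.0, "held": 1.0}


def adjust_lexicon(lexicon, train):
    """Step 1 (assumed scheme): rescale each prior weight by training evidence."""
    subj, obj = Counter(), Counter()
    for tokens, label in train:
        (subj if label == 1 else obj).update(set(tokens))  # document frequency
    n_subj = sum(1 for _, y in train if y == 1)
    n_obj = len(train) - n_subj
    adjusted = {}
    for word, prior in lexicon.items():
        # Smoothed log-odds of the word occurring in subjective sentences.
        log_odds = (math.log((subj[word] + 1) / (n_subj + 2))
                    - math.log((obj[word] + 1) / (n_obj + 2)))
        # Factor lies in (0, 2); it equals 1 when the evidence is neutral.
        adjusted[word] = prior * 2.0 / (1.0 + math.exp(-log_odds))
    return adjusted


def lexicon_score(tokens, lexicon):
    """Map the summed lexicon weight s >= 0 into [0, 1) as s / (1 + s)."""
    s = sum(lexicon.get(w, 0.0) for w in tokens)
    return s / (1.0 + s)


def train_nb(train):
    """Step 2: collect counts for a multinomial Naive Bayes model."""
    counts, docs = {0: Counter(), 1: Counter()}, Counter()
    for tokens, label in train:
        counts[label].update(tokens)
        docs[label] += 1
    return counts, docs, set(counts[0]) | set(counts[1])


def nb_score(tokens, model):
    """P(subjective | tokens) with add-one smoothing; unseen words are skipped."""
    counts, docs, vocab = model
    logp = {}
    for y in (0, 1):
        total = sum(counts[y].values())
        logp[y] = math.log(docs[y] / sum(docs.values()))
        for w in tokens:
            if w in vocab:
                logp[y] += math.log((counts[y][w] + 1) / (total + len(vocab)))
    top = max(logp.values())
    e0, e1 = math.exp(logp[0] - top), math.exp(logp[1] - top)
    return e1 / (e0 + e1)


def classify(tokens, lexicon, model, lexicon_weight=0.5):
    """Step 3 (assumed scheme): weighted average of the two scores."""
    combined = (lexicon_weight * lexicon_score(tokens, lexicon)
                + (1.0 - lexicon_weight) * nb_score(tokens, model))
    return 1 if combined >= 0.5 else 0


ADJUSTED = adjust_lexicon(LEXICON, TRAIN)
MODEL = train_nb(TRAIN)
print(classify("a terrible and wonderful mistake".split(), ADJUSTED, MODEL))
print(classify("the meeting was held on monday".split(), ADJUSTED, MODEL))
```

In this toy setup the entry "held" starts with the same weight as "terrible" but is scaled down because it only occurs in objective training sentences, which is the kind of correction that adjusting a general lexicon against training data is meant to provide. The averaging weight `lexicon_weight` is a free parameter of the sketch; in practice it would be tuned on held-out data.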
