Sentiment Analysis using Support Vector Machines with Diverse Information Sources

This paper introduces an approach to sentiment analysis that uses support vector machines (SVMs) to bring together diverse sources of potentially pertinent information, including several favorability measures for phrases and adjectives and, where available, knowledge of the topic of the text. Models using these features are further combined with unigram models, which have been shown to be effective in the past (Pang et al., 2002), and with lemmatized versions of the unigram models. Experiments on movie review data from Epinions.com demonstrate that hybrid SVMs, which combine unigram-style feature-based SVMs with those based on real-valued favorability measures, obtain superior performance, producing the best results yet published using this data. Further experiments on a smaller dataset of music reviews hand-annotated for topic, using a feature set enriched with topic information, are also reported; the results suggest that incorporating topic information into such models may also yield improvement.
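As a rough illustration of the hybrid idea described above, the following minimal sketch appends a single real-valued favorability feature to unigram presence features and trains a linear SVM on the combined vector. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy lexicon `TOY_FAVORABILITY`, the choice of a mean score as the favorability feature, the four toy reviews, combining the sources by simple feature concatenation, and the plain subgradient-descent trainer standing in for a real SVM package.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical favorability scores, invented for this sketch (not from the paper).
TOY_FAVORABILITY: Dict[str, float] = {
    "great": 1.0, "good": 0.5, "dull": -0.5, "awful": -1.0,
}


def build_vocab(docs: Sequence[str]) -> List[str]:
    """Return the sorted list of distinct lowercase unigrams in docs."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})


def hybrid_features(doc: str, vocab: Sequence[str]) -> List[float]:
    """Unigram presence features followed by one mean-favorability feature."""
    tokens = doc.lower().split()
    present = set(tokens)
    unigram_part = [1.0 if word in present else 0.0 for word in vocab]
    scores = [TOY_FAVORABILITY[t] for t in tokens if t in TOY_FAVORABILITY]
    mean_favorability = sum(scores) / len(scores) if scores else 0.0
    return unigram_part + [mean_favorability]


def train_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lam: float = 0.01,
    eta: float = 0.1,
    epochs: int = 50,
) -> Tuple[List[float], float]:
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Shrink the weights (gradient of the L2 regularizer).
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # The example violates the margin: step along the hinge subgradient.
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 (favorable) or -1 (unfavorable) for feature vector x."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy training data, invented for this sketch.
train_docs = [
    "a great and moving film",
    "good acting great script",
    "an awful dull plot",
    "dull script awful acting",
]
train_labels = [1, 1, -1, -1]

vocab = build_vocab(train_docs)
X = [hybrid_features(doc, vocab) for doc in train_docs]
w, b = train_linear_svm(X, train_labels)

for doc in ["great film", "awful plot"]:
    print(doc, "->", predict(w, b, hybrid_features(doc, vocab)))
```

In practice one would likely swap the hand-written trainer for an established SVM implementation and replace the toy lexicon with learned favorability measures; the sketch only shows how discrete unigram features and real-valued scores can share one feature vector.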

[1] Timo Järvinen, et al. A non-projective dependency parser, 1997, ANLP.

[2] Michael L. Littman, et al. Measuring praise and criticism: Inference of semantic orientation from association, 2003, TOIS.

[3] Nigel Collier, et al. A Framework for Integrating Deep and Shallow Semantic Structures in Text Mining, 2003, KES.

[4] Kenneth Ward Church, et al. Word Association Norms, Mutual Information, and Lexicography, 1989, ACL.

[5] Bo Pang, et al. Thumbs up? Sentiment Classification using Machine Learning Techniques, 2002, EMNLP.

[6] Jeonghee Yi, et al. Sentiment analysis: capturing favorability using natural language processing, 2003, K-CAP '03.

[7] Thorsten Joachims, et al. Learning to classify text using support vector machines - methods, theory and algorithms, 2002, The Kluwer international series in engineering and computer science.

[8] J. Kamps, et al. Words with attitude, 2002.

[9] Kathleen R. McKeown, et al. Predicting the semantic orientation of adjectives, 1997.

[10] Janyce Wiebe, et al. Learning Subjective Adjectives from Corpora, 2000, AAAI/IAAI.

[11] Peter D. Turney. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews, 2002, ACL.

[12] Vladimir Vapnik, et al. Statistical learning theory, 1998.

[13] Nello Cristianini, et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, 2000.

[14] J. M. Kittross. The measurement of meaning, 1959.

[15] Janyce Wiebe, et al. Effects of Adjective Orientation and Gradability on Sentence Subjectivity, 2000, COLING.

[16] Janyce Wiebe, et al. Instructions for annotating opinions in newspaper articles, 2002.