Semantic Spaces for Sentiment Analysis

This article presents a new semi-supervised method for document-level sentiment analysis. We employ a state-of-the-art supervised classification approach and enrich its feature set with word cluster features. These features are derived from clusters of words represented in semantic spaces computed on unlabeled data. We test our method on three large sentiment datasets (Czech movie reviews, Czech product reviews, and English movie reviews) and outperform the current state of the art. To the best of our knowledge, this article reports the first successful incorporation of semantic spaces based on local word co-occurrence into the sentiment analysis task.
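
The idea of word cluster features can be illustrated with a minimal sketch. It is not the implementation evaluated in the article: the window size, the cosine-based k-means variant, the seed words, and all function names (`cooccurrence_vectors`, `cluster_words`, `document_features`) are assumptions made for illustration. Words are represented by local co-occurrence counts gathered from unlabeled text, the vectors are clustered, and each document then receives cluster-ID features in addition to its ordinary word features.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Represent each word by counts of the words seen within +/- window tokens."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_words(vectors, seeds, iterations=5):
    """Toy k-means with cosine similarity; `seeds` are assumed initial centroid words."""
    centroids = [Counter(vectors[s]) for s in seeds]
    assignment = {}
    for _ in range(iterations):
        # Assign every word to its most similar centroid.
        for word, vec in vectors.items():
            sims = [cosine(vec, c) for c in centroids]
            assignment[word] = sims.index(max(sims))
        # Recompute each centroid as the sum of its members' vectors.
        for k in range(len(centroids)):
            members = [w for w, c in assignment.items() if c == k]
            if members:
                total = Counter()
                for w in members:
                    total.update(vectors[w])
                centroids[k] = total
    return assignment


def document_features(tokens, assignment):
    """Bag-of-words features enriched with one feature per word cluster ID."""
    feats = Counter(f"word={t}" for t in tokens)
    feats.update(f"cluster={assignment[t]}" for t in tokens if t in assignment)
    return feats


# Hypothetical unlabeled corpus used only to build the semantic space.
unlabeled = [
    "a good movie with a great cast".split(),
    "a great movie with a good plot".split(),
    "a bad movie with an awful cast".split(),
    "an awful movie with a bad plot".split(),
]
vectors = cooccurrence_vectors(unlabeled, window=2)
assignment = cluster_words(vectors, seeds=["good", "bad"])
print(document_features("a great plot".split(), assignment))
```

The resulting feature dictionaries could be passed to any supervised classifier. The appeal of the cluster features is that a word absent from the labeled training data can still contribute evidence through the cluster it shares with words that were seen.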
