Semi-supervised Acquisition of Croatian Sentiment Lexicon

Sentiment analysis aims to recognize subjectivity expressed in natural language texts. Subjectivity analysis determines whether a text unit is subjective or objective, while polarity analysis determines whether a subjective text is positive or negative. The sentiment of sentences and documents is often determined using a sentiment lexicon. In this paper we present three semi-supervised methods for the automated acquisition of a sentiment lexicon that do not depend on pre-existing language resources: latent semantic analysis, graph-based propagation, and topic modelling. The methods are language-independent and corpus-based, and hence especially suitable for languages for which resources are scarce. We use the presented methods to acquire a sentiment lexicon for the Croatian language. The performance of the methods was evaluated on two tasks: determining both subjectivity and polarity (the subjectivity + polarity task) and determining the polarity of subjective words (the polarity-only task). The results indicate that the methods are especially suitable for the polarity-only task.
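To make the idea of graph-based propagation concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes a word graph whose edge weights come from some corpus-based similarity, and a handful of hand-labelled seed words; the function name `propagate_polarity`, the toy edges, and the seed scores are all hypothetical. Seed scores are held fixed, and every other word repeatedly takes the weighted average of its neighbours' scores.

```python
from collections import defaultdict


def propagate_polarity(edges, seeds, iterations=50):
    """Spread seed polarity over an undirected weighted word graph.

    edges: iterable of (word_a, word_b, weight) triples (hypothetical input format).
    seeds: dict mapping seed words to fixed scores, e.g. +1.0 or -1.0.
    """
    # Build a symmetric adjacency map: word -> {neighbour: weight}.
    graph = defaultdict(dict)
    for a, b, weight in edges:
        graph[a][b] = weight
        graph[b][a] = weight
    for word in seeds:
        graph.setdefault(word, {})

    # Unlabelled words start at 0.0 (neutral); seeds start at their label.
    scores = {word: seeds.get(word, 0.0) for word in graph}

    for _ in range(iterations):
        updated = {}
        for word, neighbours in graph.items():
            if word in seeds:
                # Seed labels are clamped and never overwritten.
                updated[word] = seeds[word]
                continue
            total = sum(neighbours.values())
            if total == 0:
                updated[word] = 0.0
            else:
                # Weighted average of the neighbours' current scores.
                updated[word] = sum(
                    scores[n] * w for n, w in neighbours.items()
                ) / total
        scores = updated
    return scores


# Toy example with made-up similarity weights (assumption, not real data).
edges = [
    ("dobar", "sjajan", 1.0),
    ("sjajan", "lijep", 1.0),
    ("grozan", "jadan", 1.0),
]
seeds = {"dobar": 1.0, "grozan": -1.0}
scores = propagate_polarity(edges, seeds)
```

In this toy graph, words connected to the positive seed drift toward positive scores and words connected to the negative seed toward negative ones; a word with no path to any seed would stay at 0.0, which such a sketch could treat as objective.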
