Generating a Non-English Subjectivity Lexicon: Relations That Matter

We describe a method for creating a non-English subjectivity lexicon based on an English lexicon, an online translation service and a general purpose thesaurus: Wordnet. We use a PageRank-like algorithm to bootstrap from the translation of the English lexicon and rank the words in the thesaurus by polarity using the network of lexical relations in Wordnet. We apply our method to the Dutch language. The best results are achieved when using synonymy and antonymy relations only, and ranking positive and negative words simultaneously. Our method achieves an accuracy of 0.82 at the top 3,000 negative words, and 0.62 at the top 3,000 positive words.

[1]  Bo Pang,et al.  Thumbs up? Sentiment Classification using Machine Learning Techniques , 2002, EMNLP.

[2]  Soo-Min Kim,et al.  Determining the Sentiment of Opinions , 2004, COLING.

[3]  Katja Hofmann,et al.  The Cornetto Database: Architecture and User-Scenarios , 2007 .

[4]  Rada Mihalcea,et al.  A Bootstrapping Method for Building Subjectivity Lexicons for Languages with Scarce Resources , 2008, LREC.

[5]  Katja Hofmann,et al.  Task-based Evaluation Report: Building a Dutch Subjectivity Lexicon , 2008 .

[6]  Andrea Esuli,et al.  PageRanking WordNet Synsets: An Application to Opinion Mining , 2007, ACL.

[7]  Ronald Fagin,et al.  Comparing and aggregating rankings with ties , 2004, PODS '04.

[8]  Andrea Esuli,et al.  SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining , 2006, LREC.

[9]  Janyce Wiebe,et al.  Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis , 2005, HLT.

[10]  Ellen Riloff,et al.  Creating Subjective and Objective Sentence Classifiers from Unannotated Texts , 2005, CICLing.

[11]  Thomas Lengauer,et al.  ROCR: visualizing classifier performance in R , 2005, Bioinform..

[12]  Rada Mihalcea,et al.  Learning Multilingual Subjective Language via Cross-Lingual Projections , 2007, ACL.

[13]  Piek Vossen,et al.  EuroWordNet: A multilingual database with lexical semantic networks , 1998, Springer Netherlands.

[14]  Hugo Liu,et al.  A Corpus-based Approach to Finding Happiness , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[15]  Tejashri Inadarchand Jain,et al.  Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis , 2010 .