SentiWordNet for New Language: Automatic Translation Approach

This paper proposes an automatic translation approach for creating a sentiment lexicon for a new language from available English resources. In this approach, a sense-level resource is automatically mapped to a word-level resource by applying a triple unification process. This process produces a single polarity score for each term by incorporating the polarities of all of its senses. The main idea is to handle sense ambiguity during the lexicon transfer and to provide a general sentiment lexicon for languages such as Turkish, which do not have a freely available machine-readable dictionary. Translation quality is also critical in the lexicon transfer because of the ambiguity problem. Thus, this paper also proposes a multiple bilingual translation approach to find the most appropriate equivalents for the source language terms. In this approach, three algorithms (parallel, series, and hybrid) are used to integrate the translation results. As a result, three lexicons of different sizes are obtained for the target language. The performance of the three lexicons is evaluated on a lexicon-based sentiment classification task and compared with the results achieved by a supervised approach. According to the experimental results, the proposed approach can produce reliable sentiment lexicons for the target language.
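The idea of collapsing sense-level polarities into a single word-level score can be illustrated with a minimal sketch. This is not the paper's triple unification process, whose details are not given in the abstract. The sample entries, the function name `word_level_polarity`, and the rank-based weighting are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sense-level entries: (term, sense_rank, positive, negative).
# The values are invented for illustration, not taken from SentiWordNet.
SENSES = [
    ("good", 1, 0.75, 0.0),
    ("good", 2, 0.5, 0.0),
    ("cold", 1, 0.0, 0.75),
    ("cold", 2, 0.25, 0.25),
]


def word_level_polarity(senses):
    """Collapse sense-level scores into one polarity per term in [-1, 1].

    Assumption: each sense is weighted by 1 / rank, so that more
    frequent (lower-ranked) senses contribute more to the final score.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for term, rank, pos, neg in senses:
        weight = 1.0 / rank
        weighted_sum[term] += weight * (pos - neg)
        weight_total[term] += weight
    return {term: weighted_sum[term] / weight_total[term] for term in weighted_sum}


if __name__ == "__main__":
    for term, score in word_level_polarity(SENSES).items():
        print(f"{term}: {score:+.3f}")
```

Under this weighting, a term whose dominant sense is strongly negative stays negative even if a rarer sense is neutral. Other aggregation schemes (plain averaging, taking only the first sense) would fit the same interface.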
