Sentiment translation for low-resourced languages: experiments on Irish general election tweets

This paper presents two methods of Sentiment Analysis (SA) of User-Generated Content for a low-resource language, Irish. The first method, automatic sentiment translation, applies existing English SA resources to both manually and automatically translated tweets; this approach achieved an accuracy of 70%. The second method involved the manual creation of an Irish-language sentiment lexicon, SentiFocloir. This lexicon was used to build the first Irish SA system, SentiFocalTweet, which outperformed the first method with an accuracy of 76%. These results indicate that translation from Irish to English has only a minor effect on the preservation of sentiment, and that the SentiFocalTweet system is a successful baseline for Irish sentiment analysis.
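As a rough illustration of the second, lexicon-based method, the sketch below scores a tweet by summing per-word polarity values and mapping the total to a label. It is a minimal sketch under stated assumptions, not the SentiFocalTweet implementation: the lexicon entries, their scores, the tokenisation and the function names `score_tweet` and `classify` are all hypothetical and are not taken from SentiFocloir.

```python
import re

# Hypothetical toy lexicon: entries and scores are illustrative only
# and are NOT taken from SentiFocloir.
TOY_LEXICON = {
    "maith": 1,      # "good"
    "iontach": 2,    # "wonderful"
    "olc": -1,       # "bad"
    "uafásach": -2,  # "terrible"
}


def score_tweet(text, lexicon=TOY_LEXICON):
    """Sum the polarity scores of all lexicon words found in the text."""
    # Lowercase and split into word tokens; \w is Unicode-aware in
    # Python 3, so accented Irish vowels stay inside their tokens.
    tokens = re.findall(r"\w+", text.lower())
    # Words missing from the lexicon contribute a score of 0.
    return sum(lexicon.get(token, 0) for token in tokens)


def classify(text, lexicon=TOY_LEXICON):
    """Map the summed score to a three-way sentiment label."""
    score = score_tweet(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["Lá iontach!", "Bhí sé uafásach", "vótáil inniu"]:
        print(tweet, "->", classify(tweet))
```

A summed-score classifier of this kind needs no training data, which is one reason lexicon-based approaches are a common starting point for low-resource languages. The same function could also be applied to translated tweets with an English lexicon to mimic the first method.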
