Emoji-Based Transfer Learning for Sentiment Tasks

Sentiment tasks such as hate speech detection and sentiment analysis, especially when performed on languages other than English, are often low-resource. In this study, we exploit the emotional information encoded in emojis to enhance the performance on a variety of sentiment tasks. This is done using a transfer learning approach, where the parameters learned by an emoji-based source task are transferred to a sentiment target task. We analyse the efficacy of the transfer under three conditions, i.e. i) the emoji content and ii) label distribution of the target task as well as iii) the difference between monolingually and multilingually learned source tasks. We find i.a. that the transfer is most beneficial if the target task is balanced with high emoji content. Monolingually learned source tasks have the benefit of taking into account the culturally specific use of emojis and gain up to F1 +0.280 over the baseline.

[1]  Jayashree Subramanian,et al.  Exploiting Emojis for Sarcasm Detection , 2019, SBP-BRiMS.

[2]  Richard Socher,et al.  Learned in Translation: Contextualized Word Vectors , 2017, NIPS.

[3]  Yiming Yang,et al.  XLNet: Generalized Autoregressive Pretraining for Language Understanding , 2019, NeurIPS.

[4]  M. Cha,et al.  Cross-Cultural Comparison of Nonverbal Cues in Emoticons on Twitter: Evidence from Big Data Analysis , 2014 .

[5]  Pascale Fung,et al.  One-step and Two-step Classification for Abusive Language Detection on Twitter , 2017, ALW@ACL.

[6]  Nancy Ide,et al.  Distant Supervision for Emotion Classification with Discrete Binary Values , 2013, CICLing.

[7]  Aurélien Lucchi,et al.  SwissCheese at SemEval-2016 Task 4: Sentiment Classification Using an Ensemble of Convolutional Neural Networks with Distant Supervision , 2016, *SEMEVAL.

[8]  Preslav Nakov,et al.  SemEval-2016 Task 4: Sentiment Analysis in Twitter , 2016, *SEMEVAL.

[9]  Joaquín Padilla Montani,et al.  GermEval 2018 : German Abusive Tweet Detection , 2018 .

[10]  Isabelle Augenstein,et al.  emoji2vec: Learning Emoji Representations from their Description , 2016, SocialNLP@EMNLP.

[11]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[12]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[13]  Iyad Rahwan,et al.  Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm , 2017, EMNLP.

[14]  Paolo Rosso,et al.  SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter , 2019, *SEMEVAL.

[15]  Prasenjit Majumder,et al.  Overview of the HASOC track at FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European Languages , 2019, FIRE.

[16]  Fatih Uzdilli,et al.  A Twitter Corpus and Benchmark Resources for German Sentiment Analysis , 2017, SocialNLP@EACL.

[17]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[18]  Derek Ruths,et al.  A Web of Hate: Tackling Hateful Speech in Online Social Spaces , 2017, ArXiv.

[19]  Veselin Stoyanov,et al.  Unsupervised Cross-lingual Representation Learning at Scale , 2019, ACL.

[20]  Michael Wiegand,et al.  Overview of the GermEval 2018 Shared Task on the Identification of Offensive Language , 2018 .

[21]  Steven Bird,et al.  NLTK: The Natural Language Toolkit , 2002, ACL.

[22]  Gerard de Melo,et al.  A Helping Hand: Transfer Learning for Deep Sentiment Analysis , 2018, ACL.

[23]  Jan B. F. van Erp,et al.  EmojiGrid: A 2D pictorial scale for cross-cultural emotion assessment of negatively and positively valenced food. , 2019, Food research international.

[24]  Dirk Hovy,et al.  Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter , 2016, NAACL.