Towards Understanding Creative Language in Tweets

Extracting fine-grained information from social media is challenging because the language of social media messages is usually informal, with creative, genre-specific terminology and expressions. How to handle this challenge and automatically understand the opinions people communicate has become an active research topic. In this paper, we aim to show that leveraging pre-learned knowledge helps neural network models understand the creative language in tweets. To test this idea, we present a transfer learning model based on BERT. We fine-tuned the pre-trained BERT model and applied the customized model to two downstream tasks from SemEval-2018: Irony Detection and Emoji Prediction in tweets. Our model achieved an F-score of 38.52 (ranked 1/49) on the Emoji Prediction task, and F-scores of 67.52 (ranked 2/43) and 51.35 (ranked 1/31) on Irony Detection subtasks A and B, respectively. The experimental results support the effectiveness of this approach.
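
The abstract reports F-scores without saying how they were averaged. As a rough illustration only, the sketch below assumes a macro-averaged F1 over classes, reported as a percentage; the function name `macro_f1` and the toy labels are hypothetical and are not taken from the paper or from the SemEval scoring scripts.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1 (as a percentage) over all labels seen in gold or pred.

    Assumption: every class counts equally, regardless of its frequency.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # predicted p, but it was wrong
            fn[g] += 1          # missed an instance of class g

    labels = set(gold) | set(pred)
    if not labels:
        return 0.0

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return 100.0 * sum(scores) / len(scores)


# Toy usage with two classes (e.g. 0 = not ironic, 1 = ironic).
gold_labels = [0, 0, 1, 1]
predicted = [0, 1, 1, 1]
score = macro_f1(gold_labels, predicted)
```

Macro averaging is one common choice for tasks with imbalanced label distributions, because rare classes weigh as much as frequent ones; a binary task may instead report F1 for the positive class only.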
