UPB at SemEval-2020 Task 9: Identifying Sentiment in Code-Mixed Social Media Texts Using Transformers and Multi-Task Learning

Sentiment analysis is a process widely used in modern opinion mining campaigns, with applications in a variety of fields, especially in collecting information about the attitude or satisfaction of users towards a particular subject. However, the task becomes noticeably more difficult when it is applied to cultures that tend to combine two languages in order to express ideas and thoughts. By interleaving words from two languages, users can express themselves with ease, but at the cost of making the text far less intelligible, both for readers who are unfamiliar with this practice and for standard opinion mining algorithms. In this paper, we describe the systems developed by our team for SemEval-2020 Task 9, which covers two well-known code-mixed language pairs: Hindi-English and Spanish-English. We address this problem with a solution that takes advantage of several neural network approaches, as well as pre-trained word embeddings. Our approach based on multilingual BERT achieves promising performance on the Hindi-English task, with an average F1-score of 0.6850 registered on the competition leaderboard, ranking our team 16th out of 62 participants. For the Spanish-English task, we obtained an average F1-score of 0.7064, ranking our team 17th out of 29 participants, by using another multilingual Transformer-based model, XLM-RoBERTa.
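
Both submissions rest on fine-tuning multilingual Transformer encoders (multilingual BERT for Hindi-English, XLM-RoBERTa for Spanish-English) as sentence-level sentiment classifiers over code-mixed tweets. The snippet below is a minimal sketch of that setup, assuming the Hugging Face transformers library and PyTorch; the checkpoint names, the three-way label mapping, and the toy example are illustrative assumptions, not details taken verbatim from the paper.

```python
# Minimal sketch (not the authors' exact pipeline): fine-tuning a multilingual
# Transformer for three-way sentiment classification of code-mixed tweets,
# assuming the Hugging Face `transformers` library and PyTorch.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# "bert-base-multilingual-cased" for Hindi-English, or "xlm-roberta-base"
# for Spanish-English; 3 labels = negative / neutral / positive (assumed mapping).
model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Toy code-mixed example (hypothetical data, for illustration only).
texts = ["yeh movie bahut achhi thi, loved it!"]
labels = torch.tensor([2])  # e.g. 0 = negative, 1 = neutral, 2 = positive

batch = tokenizer(texts, padding=True, truncation=True, max_length=128,
                  return_tensors="pt")
outputs = model(**batch, labels=labels)
loss = outputs.loss   # cross-entropy over the three sentiment classes
loss.backward()       # an optimizer step would follow in a full training loop
```

Swapping model_name for "xlm-roberta-base" gives the Spanish-English variant; in practice, a complete training loop with an optimizer, learning-rate scheduling, and validation on the official development split would follow this forward/backward pass.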
