Paper information - FII-UAIC at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text Using CNN


The “Sentiment Analysis for Code-Mixed Social Media Text” task at the SemEval 2020 competition focuses on sentiment analysis in code-mixed social media text, specifically on the combination of English with Spanish (Spanglish) and Hindi (Hinglish). In this paper, we present a system that classifies code-mixed Spanish-English tweets as positive, negative, or neutral. First, we built a classifier that assigns the corresponding sentiment labels; in addition to the sentiment labels, we provide language labels at the word level. Second, we generate a word-level representation using a Convolutional Neural Network (CNN) architecture. Our solution shows promising results for the Sentimix Spanglish-English task (0.744), where our team, Lavinia_Ap, placed 9th. However, for the Sentimix Hindi-English task (0.324), the results need improvement.
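
The abstract does not specify the network's layers or hyperparameters, so the following is only an illustrative sketch of the general technique it names: a 1D convolution over word-level vectors, max-over-time pooling, and a softmax over the three sentiment classes. Everything here is an assumption made for illustration: the function names (`embed`, `conv1d_max_pool`, `classify`, `random_model`), the embedding size, the filter widths, and the random, untrained weights. It is not the authors' architecture.

```python
import math
import random

LABELS = ["negative", "neutral", "positive"]


def embed(tokens, dim, seed=0):
    """Map each token to a fixed pseudo-random vector (hypothetical stand-in for learned embeddings)."""
    vectors = []
    for tok in tokens:
        rng = random.Random(f"{seed}:{tok}")  # same token always gets the same vector
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return vectors


def conv1d_max_pool(vectors, filt, bias):
    """Slide one filter over the token sequence, apply ReLU, then max-pool over time.

    Returns 0.0 when the sequence is shorter than the filter width (no padding is applied).
    """
    width = len(filt)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(vectors) - width + 1):
        total = bias
        for k in range(width):
            total += sum(w * x for w, x in zip(filt[k], vectors[start + k]))
        best = max(best, total)
    return best


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def random_model(dim=8, widths=(2, 3), seed=1):
    """Build untrained random weights: one filter per width and a linear output layer."""
    rng = random.Random(seed)
    filters = [[[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(w)] for w in widths]
    biases = [0.0 for _ in widths]
    out_weights = [[rng.uniform(-0.5, 0.5) for _ in widths] for _ in LABELS]
    return filters, biases, out_weights


def classify(tokens, filters, biases, out_weights, dim):
    """Return (predicted label, class probabilities) for a tokenized tweet."""
    vectors = embed(tokens, dim)
    features = [conv1d_max_pool(vectors, f, b) for f, b in zip(filters, biases)]
    scores = [sum(w * f for w, f in zip(row, features)) for row in out_weights]
    probs = softmax(scores)
    return LABELS[probs.index(max(probs))], probs


filters, biases, out_weights = random_model()
label, probs = classify("me encanta this song".split(), filters, biases, out_weights, dim=8)
```

Because the weights are random, the predicted label carries no meaning; the sketch only shows how a code-mixed token sequence flows from word vectors through convolution and pooling to a three-way class distribution. A trained system would learn the embeddings, filters, and output weights from labeled tweets.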

Daniela Gîfu | Lavinia Aparaschivei | Andrei Palihovici
