Exploiting Multilingual Neural Linguistic Representation for Sentiment Classification of Political Tweets in Code-mix Language

Social media is increasingly used by people around the world to express their feelings and opinions in the form of short text messages. Twitter is a rapidly growing microblogging and social networking website where people express their opinions concisely and simply. It has also become a platform for governmental, non-governmental, and individual opinions and policy announcements. Detecting sentiment in tweets has a wide range of applications, including identifying anxiety or depression in individuals and measuring the well-being or mood of a community. Sentiment classification becomes more complex when a tweet is written in a code-mixed language, that is, a mix of two different languages. The main objective of this paper is to classify tweets written in a mix of Tamil and English as positive, negative, or neutral. This is achieved by fine-tuning a pretrained multilingual text representation model as well as deep learning transformers. The proposed approach is evaluated on a large collection of tweets about societal issues in India. We also provide a comparative study of different machine learning and deep learning models. The proposed architecture, based on neural linguistic representation, achieves significant accuracy in classifying both Tamil and code-mixed tweets.
