Identifying Sentiment of Malayalam Tweets Using Deep Learning

This chapter presents a comparative study of sentiment identification in Malayalam tweets using two deep learning methods: convolutional neural networks (CNN) and long short-term memory (LSTM) networks. The baselines are support vector machines (SVM) and regularized least squares classification with random kitchen sink mapping (RKS-RLSC). Malayalam is a low-resource language spoken in the state of Kerala, India. Because no labeled data were available, tweets were collected and manually labeled by polarity as positive, negative, or neutral. RKS mapping is a well-explored approach in which data are nonlinearly mapped to a higher-dimensional space where a linear classifier can be used. The evaluation measures chosen for the experiments are accuracy, precision, recall, and F1-score. The experiments also include a comparison, based on accuracy, with classical methods: logistic regression (LR), AdaBoost (Ab), random forest (RF), decision tree (DT), and k-nearest neighbor (KNN). For the CNN and LSTM experiments, we report the effect of the activation functions rectified linear units (ReLU), exponential linear units (ELU), and scaled exponential linear units (SELU), and compare the resulting sentiment identification performance on Malayalam tweets against SVM and RKS-RLSC.
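For readers unfamiliar with the three activation functions named above, the following is a minimal sketch of their standard definitions. It is an illustration only, not the chapter's implementation: the function names are hypothetical, the ELU default of alpha = 1.0 is an assumption, and the SELU constants are the fixed values commonly given for self-normalizing networks.

```python
import math

# Fixed SELU constants as commonly published for self-normalizing networks.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x: float) -> float:
    """Rectified linear unit: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential linear unit: identity for x > 0, smooth saturation to -alpha otherwise.

    alpha = 1.0 is an assumed default, not a value taken from the chapter.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def selu(x: float) -> float:
    """Scaled ELU: an ELU with fixed alpha, multiplied by a fixed scale."""
    return SELU_SCALE * elu(x, SELU_ALPHA)


if __name__ == "__main__":
    # Print the three activations side by side for a few sample inputs.
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={value:+.1f}  relu={relu(value):.4f}  "
              f"elu={elu(value):.4f}  selu={selu(value):.4f}")
```

The three functions agree in shape for positive inputs (SELU only rescales) and differ in how they treat negative inputs: ReLU discards them, while ELU and SELU keep a bounded negative response.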
