Hybrid Words Representation for Airlines Sentiment Analysis

Social media sentiment analysis is an interesting field that aims to analyze social conversation and determine the deeper context as it applies to a topic or theme. However, it is challenging because tweets are unstructured, informal, and noisy in nature, and it involves natural language complexities such as words with multiple meanings (polysemy). Most existing approaches rely mainly on clean textual data, whereas real-world Twitter data is quite noisy. Aiming to improve performance, in this paper we present a hybrid word representation combined with a Bi-directional Long Short-Term Memory (BiLSTM) network and attention modeling. The approach improves tweet quality by treating the noise within the textual context while also accounting for polysemy, semantics, syntax, out-of-vocabulary (OOV) words, and the sentiment of words within a tweet. The proposed model overcomes the current limitations and improves tweet classification accuracy, as shown by an evaluation performed on real-world airline-related datasets.
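To make the architecture concrete, the following is a minimal sketch of a BiLSTM classifier with additive attention over token states, assuming a PyTorch implementation in which a generic pre-trained embedding matrix stands in for the paper's hybrid word representation; the layer sizes, class count, and attention formulation are illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch: BiLSTM with additive attention for 3-class tweet sentiment
# (negative / neutral / positive). The embedding matrix is a placeholder for
# the hybrid word representation; all hyperparameters are illustrative.
import torch
import torch.nn as nn

class BiLSTMAttentionClassifier(nn.Module):
    def __init__(self, embedding_matrix, hidden_size=128, num_classes=3):
        super().__init__()
        vocab_size, embed_dim = embedding_matrix.shape
        self.embedding = nn.Embedding.from_pretrained(
            torch.as_tensor(embedding_matrix, dtype=torch.float32),
            freeze=False,
        )
        self.bilstm = nn.LSTM(embed_dim, hidden_size,
                              batch_first=True, bidirectional=True)
        # Additive attention: score each time step, then take a weighted sum.
        self.attn = nn.Linear(2 * hidden_size, 1)
        self.classifier = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, token_ids, mask):
        # token_ids: (batch, seq_len); mask: 1 for real tokens, 0 for padding.
        states, _ = self.bilstm(self.embedding(token_ids))      # (B, T, 2H)
        scores = self.attn(states).squeeze(-1)                  # (B, T)
        scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)   # (B, T, 1)
        context = (weights * states).sum(dim=1)                 # (B, 2H)
        return self.classifier(context)                         # (B, num_classes)

if __name__ == "__main__":
    # Random embeddings stand in for the hybrid word vectors in this sketch.
    fake_embeddings = torch.randn(10_000, 300)
    model = BiLSTMAttentionClassifier(fake_embeddings)
    ids = torch.randint(0, 10_000, (4, 30))
    mask = torch.ones(4, 30, dtype=torch.long)
    print(model(ids, mask).shape)  # torch.Size([4, 3])
```

The attention layer lets the classifier weight sentiment-bearing tokens more heavily than noise, which is the role attention modeling plays in the described pipeline.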
