Contextual sentiment embeddings via bi-directional GRU language model

Abstract

Compared with conventional word embeddings, sentiment embeddings can distinguish words that occur in similar contexts but carry opposite sentiment. They incorporate sentiment information from labeled corpora or lexicons through either end-to-end training or sentiment refinement. However, these methods have two major limitations. First, traditional approaches assign a fixed representation to each word and ignore how word meaning varies across contexts. As a result, the polarity of an emotional word may change with context, yet the word always receives the same representation. Second, out-of-vocabulary (OOV) or informally written sentiment words are assigned generic vectors (e.g., ). In addition, affective words that are not included in affective corpora or lexicons are treated as neutral. Building a neural model on such low-quality embeddings reduces its performance. This study proposes a model for training contextual sentiment embeddings. A stacked two-layer GRU model is used as the language model and is simultaneously trained to incorporate semantic and sentiment information from labeled corpora and lexicons. To deal with OOV or informally written sentiment words, the WordPiece tokenizer is used to divide the text into subwords. The resulting model can be transferred to downstream applications either as a feature extractor or by fine-tuning. The results show that the proposed model can handle unseen or informally written sentiment words and thus outperforms previously proposed methods.
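As an illustration of the subword step, the following is a minimal sketch of greedy longest-match-first splitting in the style of WordPiece. It is not the authors' implementation: the function name `wordpiece_tokenize`, the toy vocabulary `TOY_VOCAB`, and the `[UNK]` fallback token are assumptions made for this example, and a real vocabulary would be learned from a corpus.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one word into subwords by greedy longest-match-first lookup.

    Non-initial pieces carry the "##" continuation prefix. If no piece in
    the vocabulary matches at some position, the whole word maps to unk_token.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, for illustration only.
TOY_VOCAB = {"good", "g", "##oo", "##o", "##d", "un", "happy", "##happy", "##ness"}

in_vocab = wordpiece_tokenize("good", TOY_VOCAB)
informal = wordpiece_tokenize("goood", TOY_VOCAB)
unseen = wordpiece_tokenize("unhappyness", TOY_VOCAB)
no_match = wordpiece_tokenize("xyz", TOY_VOCAB)

print(in_vocab, informal, unseen, no_match)
```

With this toy vocabulary, an elongated spelling such as "goood" and an unseen form such as "unhappyness" are broken into known pieces instead of collapsing to a single generic token, so a language model can still build a representation for them from their subwords.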
