Adversarial learning of sentiment word representations for sentiment analysis

Abstract Word embeddings represent words as distributed features and can boost performance on sentiment analysis tasks. However, most word embeddings capture only semantic and syntactic information and ignore sentiment. As a result, words with opposite sentiment polarities (e.g., happy and sad, or good and bad) can have similar embeddings because they appear in similar contexts. To incorporate sentiment information into word vectors, several sentiment-embedding approaches have been proposed. Built on end-to-end architectures, these methods typically take the sentiment label of a whole sentence as the output and use it to propagate gradients that update the context word vectors; consequently, context words with inconsistent polarities still share the same gradient during updating. To address this, we propose an adversarial learning method for training sentiment word embeddings, in which a discriminator forces the generator to produce high-quality word embeddings using both semantic and sentiment information. In addition, the generator applies multi-head self-attention to re-weight the gradients so that sentiment and semantic information are captured efficiently. Comparative experiments were conducted on word- and sentence-level benchmarks. The results demonstrate that the proposed method outperforms previous sentiment-embedding training models.
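To make the re-weighting idea concrete, the sketch below shows how multi-head self-attention produces per-position mixtures of context word vectors, so each context word can receive a different weight rather than all words sharing one signal. This is only an illustrative NumPy sketch under our own assumptions, not the paper's implementation: the projection matrices here are random for demonstration, whereas in the proposed generator they would be learned parameters trained adversarially against the discriminator.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, num_heads=2, seed=0):
    """Re-weight context word vectors with multi-head self-attention.

    X: (seq_len, d_model) matrix of context word embeddings.
    Returns a (seq_len, d_model) matrix in which each row is a
    weighted mixture of the context rows, so gradients flowing back
    through it are scaled differently per context word.
    The projection matrices are random here purely for illustration;
    in training they would be learned generator parameters.
    """
    seq_len, d_model = X.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    rng = np.random.default_rng(seed)
    heads = []
    for _ in range(num_heads):
        Wq = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
        Wk = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
        Wv = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        A = softmax(Q @ K.T / np.sqrt(d_head))  # (seq_len, seq_len) attention weights
        heads.append(A @ V)                     # per-position weighted mixture
    return np.concatenate(heads, axis=1)        # (seq_len, d_model)

# Four context word vectors of dimension 8
X = np.random.default_rng(1).standard_normal((4, 8))
out = multi_head_self_attention(X)
print(out.shape)  # (4, 8)
```

Because the attention weights in each row sum to one but differ across positions, each context word contributes a distinct coefficient to the output, which is the mechanism the method relies on to avoid identical gradient updates for polarity-inconsistent context words.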

[1]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[2]  Tomas Mikolov,et al.  Enriching Word Vectors with Subword Information , 2016, TACL.

[3]  Ming Zhou,et al.  Sentiment Embeddings with Applications to Sentiment Analysis , 2016, IEEE Transactions on Knowledge and Data Engineering.

[4]  SchmidhuberJürgen Deep learning in neural networks , 2015 .

[5]  Nancy Ide,et al.  Distant Supervision for Emotion Classification with Discrete Binary Values , 2013, CICLing.

[6]  John B. Lowe,et al.  The Berkeley FrameNet Project , 1998, ACL.

[7]  Dong-Hong Ji,et al.  A topic-enhanced word embedding for Twitter sentiment classification , 2016, Inf. Sci..

[8]  Xuejie Zhang,et al.  Refining Word Embeddings Using Intensity Scores for Sentiment Analysis , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[9]  Liang-Chih Yu,et al.  Tree-Structured Regional CNN-LSTM Model for Dimensional Sentiment Analysis , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[10]  Aapo Hyvärinen,et al.  Noise-Contrastive Estimation of Unnormalized Statistical Models, with Applications to Natural Image Statistics , 2012, J. Mach. Learn. Res..

[11]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[12]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[13]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[14]  Jürgen Schmidhuber,et al.  Deep learning in neural networks: An overview , 2014, Neural Networks.

[15]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[16]  Xuejie Zhang,et al.  Using a stacked residual LSTM model for sentiment intensity prediction , 2018, Neurocomputing.

[17]  Guodong Zhou,et al.  Tree kernel-based semantic relation extraction with rich syntactic and semantic information , 2010, Inf. Sci..

[18]  MATHEUS ARAUJO,et al.  A comparative study of machine translation for multilingual sentence-level sentiment analysis , 2020, Inf. Sci..

[19]  Nikos Pelekis,et al.  DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis , 2017, *SEMEVAL.

[20]  Jiaqi Wang,et al.  Three-way enhanced convolutional neural networks for sentence-level sentiment classification , 2019, Inf. Sci..

[21]  Min-Chul Yang,et al.  Target-aware convolutional neural network for target-level sentiment analysis , 2019, Inf. Sci..

[22]  Kevin Gimpel,et al.  From Paraphrase Database to Compositional Paraphrase Model and Back , 2015, Transactions of the Association for Computational Linguistics.

[23]  Mark Anderson,et al.  Design of Experiments: Statistical Principles of Research Design and Analysis , 2001, Technometrics.

[24]  Amy Beth Warriner,et al.  Norms of valence, arousal, and dominance for 13,915 English lemmas , 2013, Behavior Research Methods.

[25]  Graeme Hirst,et al.  Enriching Word Embeddings with a Regressor Instead of Labeled Corpora , 2019, AAAI.

[26]  Jun Yu,et al.  Local Deep-Feature Alignment for Unsupervised Dimension Reduction , 2018, IEEE Transactions on Image Processing.