Self-Attention-Based BiLSTM Model for Short Text Fine-Grained Sentiment Classification

Fine-grained sentiment polarity classification for short texts has long been an important and challenging task in natural language processing. A short text may contain multiple aspect-terms, with opinion terms expressing different sentiments toward different aspect-terms, and the polarity of the whole sentence is highly correlated with both. Two challenges follow: how to effectively exploit contextual information and semantic features, and how to model the correlations between aspect-terms and context words, including opinion terms. To address these problems, a self-attention-based BiLSTM model that incorporates aspect-term information is proposed for fine-grained sentiment polarity classification of short texts. The proposed model effectively uses contextual information and semantic features and, in particular, models the correlations between aspect-terms and context words. It consists of a word-encoding layer, a BiLSTM layer, a self-attention layer, and a softmax layer. The BiLSTM layer aggregates information from the two opposite directions of a sentence through two independent LSTMs. The self-attention layer captures the parts of a sentence that matter most for a given aspect-term. Between the BiLSTM layer and the self-attention layer, the hidden vector and the aspect-term vector are fused by addition, which avoids the computational cost of direct vector concatenation. Experiments are conducted on the public Restaurant and Laptop corpora from SemEval 2014 Task 4 and on the Twitter corpus from ACL 2014, with the Friedman and Nemenyi tests used in the comparison study. Compared with existing methods, the experimental results demonstrate that the proposed model is feasible and efficient.
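To make the described layer stack concrete, below is a minimal PyTorch sketch of the pipeline the abstract outlines (word encoding, BiLSTM, additive fusion with the aspect-term vector, self-attention, softmax). It is an illustration under stated assumptions, not the authors' implementation: all dimensions, the single-head scaled-dot-product form of self-attention, the mean-pooled aspect embedding, and names such as AspectSelfAttnBiLSTM are assumptions introduced here.

```python
# Minimal sketch (not the authors' code) of the described pipeline:
# word embedding -> BiLSTM -> additive fusion with the aspect-term
# vector -> self-attention -> softmax classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AspectSelfAttnBiLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)          # word-encoding layer
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        d = 2 * hidden_dim                                        # BiLSTM output size
        # Project the aspect-term vector to the BiLSTM output size so the
        # two can be fused by addition instead of concatenation.
        self.aspect_proj = nn.Linear(embed_dim, d)
        # Single-head scaled-dot-product self-attention (an assumption;
        # the paper only specifies "a self-attention layer").
        self.q = nn.Linear(d, d)
        self.k = nn.Linear(d, d)
        self.v = nn.Linear(d, d)
        self.out = nn.Linear(d, num_classes)

    def forward(self, tokens, aspect_tokens):
        # tokens: (batch, seq_len); aspect_tokens: (batch, aspect_len)
        x = self.embed(tokens)                                    # (B, T, E)
        h, _ = self.bilstm(x)                                     # (B, T, 2H)
        # Average the aspect-term embeddings into a single vector.
        a = self.embed(aspect_tokens).mean(dim=1)                 # (B, E)
        # Additive fusion: add the projected aspect vector to every
        # hidden state, keeping the feature width at 2H.
        h = h + self.aspect_proj(a).unsqueeze(1)                  # (B, T, 2H)
        q, k, v = self.q(h), self.k(h), self.v(h)
        scores = q @ k.transpose(1, 2) / (h.size(-1) ** 0.5)      # (B, T, T)
        attn = F.softmax(scores, dim=-1)
        ctx = (attn @ v).mean(dim=1)                              # pool to (B, 2H)
        return F.log_softmax(self.out(ctx), dim=-1)               # class scores


# Tiny smoke test with random token ids for a 3-class polarity task.
model = AspectSelfAttnBiLSTM(vocab_size=1000)
logits = model(torch.randint(0, 1000, (2, 12)), torch.randint(0, 1000, (2, 2)))
print(logits.shape)  # torch.Size([2, 3])
```

The additive fusion matches the abstract's complexity argument: adding keeps the attention input at 2H dimensions, whereas concatenating the aspect vector would widen it to 2H + E and enlarge every downstream weight matrix.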

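The comparison study can be reproduced in spirit with standard tools: the Friedman test checks whether the competing models perform equally across datasets, and the Nemenyi post-hoc test locates which pairs differ. The sketch below uses SciPy and the scikit-posthocs package; the accuracy matrix is an illustrative placeholder, not results from the paper.

```python
# Sketch of the significance tests named in the abstract, using SciPy's
# Friedman test and the Nemenyi post-hoc test from scikit-posthocs.
# The accuracy values are illustrative placeholders, not paper results.
import numpy as np
from scipy.stats import friedmanchisquare
import scikit_posthocs as sp

# Rows: evaluation blocks (e.g., datasets/folds); columns: competing models.
acc = np.array([
    [0.71, 0.74, 0.76],
    [0.68, 0.70, 0.73],
    [0.72, 0.73, 0.77],
    [0.69, 0.71, 0.75],
])

stat, p = friedmanchisquare(acc[:, 0], acc[:, 1], acc[:, 2])
print(f"Friedman chi2={stat:.3f}, p={p:.4f}")

# If the Friedman test rejects equal performance, Nemenyi identifies
# which model pairs differ; this returns a matrix of pairwise p-values.
print(sp.posthoc_nemenyi_friedman(acc))
```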