LSTM Based Semi-Supervised Attention Framework for Sentiment Analysis

With the rapid development of Internet technology and social media, people have become accustomed to posting comments online. Sentiment analysis is widely used by researchers to determine the sentiment polarity of these comments. The fundamental challenge is how to extract features and build a proper mechanism to learn them. Many deep learning models based on word embeddings have been proposed for sentiment analysis in the literature, and semi-supervised learning methods make it possible to use both labelled and unlabelled data for this kind of task. Furthermore, the attention mechanism proposed in recent years has achieved strong results on natural language processing (NLP) tasks because it helps capture the important information in documents. Inspired by these works, in this paper we propose a long short-term memory (LSTM) based semi-supervised attention framework for sentiment analysis, composed of an unsupervised attention-based LSTM encoder-decoder and a supervised attention-based LSTM model followed by a Softmax layer. The unsupervised part learns a high-dimensional representation of the documents, and the supervised part extracts features and emphasizes the parts that are important for classification. Experiments on commonly used datasets demonstrate its capability on sentiment analysis tasks.
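The supervised part described above (attention over LSTM hidden states, followed by a Softmax layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the hidden states are random stand-ins for LSTM outputs, and the scoring function, the parameter names (`w_att`, `w_out`, `b_out`), and all dimensions are assumptions chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_pool(hidden_states, w_att):
    # hidden_states: (T, d) sequence of per-token states (assumed to come from an LSTM).
    # w_att: (d,) hypothetical attention parameter vector.
    scores = np.tanh(hidden_states) @ w_att   # one scalar score per time step, shape (T,)
    alpha = softmax(scores)                   # attention weights, sum to 1
    context = alpha @ hidden_states           # weighted sum of states, shape (d,)
    return context, alpha

def classify(hidden_states, w_att, w_out, b_out):
    # Attention pooling followed by a softmax layer over sentiment classes.
    context, alpha = attention_pool(hidden_states, w_att)
    probs = softmax(w_out @ context + b_out)
    return probs, alpha

# Toy inputs: random values standing in for LSTM outputs and learned weights.
rng = np.random.default_rng(0)
T, d, n_classes = 5, 8, 2
H = rng.normal(size=(T, d))
w_att = rng.normal(size=d)
w_out = rng.normal(size=(n_classes, d))
b_out = np.zeros(n_classes)

probs, alpha = classify(H, w_att, w_out, b_out)
print("attention weights:", alpha)
print("class probabilities:", probs)
```

In this sketch the attention weights `alpha` indicate how much each token's hidden state contributes to the document representation, which is the sense in which attention can emphasize the parts that matter for classification.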
