Learning Sentiment-Specific Word Embedding via Global Sentiment Representation

Context-based word embedding learning approaches can model rich semantic and syntactic information. However, it is problematic for sentiment analysis because the words with similar contexts but opposite sentiment polarities, such as good and bad, are mapped into close word vectors in the embedding space. Recently, some sentiment embedding learning methods have been proposed, but most of them are designed to work well on sentence-level texts. Directly applying those models to document-level texts often leads to unsatisfied results. To address this issue, we present a sentimentspecific word embedding learning architecture that utilizes local context information as well as global sentiment representation. The architecture is applicable for both sentencelevel and document-level texts. We take global sentiment representation as a simple average of word embeddings in the text, and use a corruption strategy as a sentiment-dependent regularization. Extensive experiments conducted on several benchmark datasets demonstrate that the proposed architecture outperforms the state-of-the-art methods for sentiment classification.

[1]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[2]  Yi Zheng,et al.  Weakly-Supervised Deep Learning for Customer Review Sentiment Classification , 2016, IJCAI.

[3]  Yang Liu,et al.  Visualizing and Understanding Neural Machine Translation , 2017, ACL.

[4]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[5]  Christopher D. Manning,et al.  Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.

[6]  Ming Zhou,et al.  Sentiment Embeddings with Applications to Sentiment Analysis , 2016, IEEE Transactions on Knowledge and Data Engineering.

[7]  Pascal Vincent,et al.  Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[9]  Wenpeng Yin,et al.  Multichannel Variable-Size Convolution for Sentence Classification , 2015, CoNLL.

[10]  Hod Lipson,et al.  Re-embedding words , 2013, ACL.

[11]  Ming Zhou,et al.  Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification , 2014, ACL.

[12]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[13]  Minmin Chen,et al.  Efficient Vector Representation for Documents through Corruption , 2017, ICLR.

[14]  Xin Wang,et al.  Predicting Polarities of Tweets by Composing Word Embeddings with Long Short-Term Memory , 2015, ACL.

[15]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[16]  Quoc V. Le,et al.  Document Embedding with Paragraph Vectors , 2015, ArXiv.

[17]  Hua Wu,et al.  An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge , 2017, ACL.

[18]  Christopher Potts,et al.  Learning Word Vectors for Sentiment Analysis , 2011, ACL.

[19]  Zhiyuan Liu,et al.  Improving Word Representations with Document Labels , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[20]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[21]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[22]  Xiaocheng Feng,et al.  Effective LSTMs for Target-Dependent Sentiment Classification , 2015, COLING.

[23]  Preslav Nakov,et al.  SemEval-2013 Task 2: Sentiment Analysis in Twitter , 2013, *SEMEVAL.

[24]  Andrew Y. Ng,et al.  Semantic Compositionality through Recursive Matrix-Vector Spaces , 2012, EMNLP.

[25]  Jeffrey Pennington,et al.  Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection , 2011, NIPS.

[26]  Zhihua Zhang,et al.  Learning sentiment-inherent word embedding for word-level and sentence-level sentiment analysis , 2015, 2015 International Conference on Asian Language Processing (IALP).

[27]  Sanja Fidler,et al.  Skip-Thought Vectors , 2015, NIPS.

[28]  Phil Blunsom,et al.  A Convolutional Neural Network for Modelling Sentences , 2014, ACL.

[29]  Dai Quoc Nguyen,et al.  Sentiment Classification on Polarity Reviews: An Empirical Study Using Rating-based Features , 2014, WASSA@ACL.