An Integrated Word Embedding-Based Dual-Task Learning Method for Sentiment Analysis

Sentiment analysis aims to automate the task of determining the sentiment tendency of a textual review, typically expressed as positive, negative, or neutral. In general, the basic feature extraction solution for sentiment analysis is the word embedding technique, which captures only contextual or global semantic information and ignores the sentiment polarity of the text. As a result, such embeddings can produce biased analysis results, especially for words that share the same semantic context but carry opposite sentiments. In this paper, we propose an integrated sentiment embedding method that combines context and sentiment information through a dual-task learning algorithm for sentiment analysis. First, we propose three sentiment language models that encode the sentiment information of texts into word embeddings, built on three existing semantic models, namely, the continuous bag-of-words, prediction, and log-bilinear models. Next, based on the semantic language models and the proposed sentiment language models, we propose a dual-task learning algorithm that generates hybrid word embeddings, named integrated sentiment embeddings, in which a joint learning method and a parallel learning method are applied to process the two tasks together. Experiments on sentence-level and document-level sentiment classification tasks demonstrate that the proposed integrated sentiment embeddings achieve better classification performance than basic word embedding methods.
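
To make the dual-task idea concrete, the following is a minimal sketch of the joint-learning variant, assuming a shared embedding table trained simultaneously on a CBOW-style context objective and a sentence-level sentiment objective. The class name DualTaskEmbedding, the dimensions, and the loss weight alpha are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch: joint learning of semantic and sentiment tasks over
# a shared word-embedding table (not the paper's exact implementation).
import torch
import torch.nn as nn

class DualTaskEmbedding(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_polarities=2):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embed_dim)       # shared word vectors
        self.context_head = nn.Linear(embed_dim, vocab_size)        # CBOW-style: predict the center word
        self.sentiment_head = nn.Linear(embed_dim, num_polarities)  # predict sentence polarity

    def forward(self, context_ids):
        # context_ids: (batch, window) indices of surrounding words
        hidden = self.embeddings(context_ids).mean(dim=1)            # average the context vectors
        return self.context_head(hidden), self.sentiment_head(hidden)

# Joint loss: both tasks update the same embedding parameters.
model = DualTaskEmbedding(vocab_size=10_000, embed_dim=100)
ce = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

context_ids = torch.randint(0, 10_000, (32, 4))   # toy batch: 32 windows of 4 context words
center_ids = torch.randint(0, 10_000, (32,))      # center-word targets for the semantic task
polarity = torch.randint(0, 2, (32,))             # sentence-level sentiment labels

optimizer.zero_grad()
word_logits, sent_logits = model(context_ids)
alpha = 0.5                                       # assumed weighting between the two tasks
loss = alpha * ce(word_logits, center_ids) + (1 - alpha) * ce(sent_logits, polarity)
loss.backward()
optimizer.step()
```

The parallel-learning variant mentioned in the abstract could instead alternate updates between the two objectives rather than summing them into a single weighted loss; either way, the sentiment signal is injected into the same embedding table that carries the semantic signal.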
