Attentional Encoder Network for Targeted Sentiment Classification

Targeted sentiment classification aims to determine the sentiment polarity towards specific targets. Most previous approaches model context and target words with recurrent neural networks such as LSTMs in conjunction with attention mechanisms. However, LSTM networks are difficult to parallelize because of their sequential nature. Moreover, since full backpropagation over a long sequence requires large amounts of memory, essentially every implementation of backpropagation through time is truncated, which makes it harder to capture long-term patterns. To address these issues, this paper proposes an Attentional Encoder Network (AEN) for targeted sentiment classification. In contrast to previous LSTM-based works, AEN eschews complex recurrent neural networks and employs attention-based encoders to model the interaction between context and target, drawing out rich introspective and interactive semantic information from the word embeddings regardless of the distance between words. The paper also raises the label unreliability issue and introduces a label smoothing regularization term into the loss function to encourage the model to be less confident about the training labels. Experimental results on three benchmark datasets demonstrate that the model achieves comparable or superior performance with a lightweight model size.
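The label smoothing regularization mentioned above can be sketched as follows: instead of training against a one-hot target, the target distribution is mixed with the uniform distribution over classes, so the loss penalizes over-confident predictions. This is a minimal NumPy sketch of the general technique, not the paper's exact implementation; the smoothing weight `epsilon=0.1` is an illustrative choice.

```python
import numpy as np

def label_smoothing_ce(logits, target, num_classes, epsilon=0.1):
    """Cross-entropy with label smoothing.

    The one-hot target is replaced by
        q(k) = (1 - epsilon) * one_hot(k) + epsilon / num_classes,
    i.e. a mixture of the ground-truth distribution and the uniform
    distribution u(k) = 1 / num_classes.
    """
    # Numerically stable log-softmax over the logits.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())

    # Smoothed target distribution.
    q = np.full(num_classes, epsilon / num_classes)
    q[target] += 1.0 - epsilon

    # Cross-entropy between the smoothed targets and the model distribution.
    return -(q * log_probs).sum()
```

With `epsilon=0` this reduces to ordinary cross-entropy; with `epsilon > 0`, a model that puts nearly all its probability mass on the correct class still incurs a nonzero loss, which discourages over-confidence on possibly unreliable labels.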
