BERT-Based Hierarchical Sequence Classification for Context-Aware Microblog Sentiment Analysis

In the microblog sentiment analysis task, most existing algorithms treat each microblog in isolation. In many cases, however, the sentiment of a microblog is ambiguous and context-dependent, as with microblogs written in an ironic tone or non-sentimental content that nevertheless conveys an emotional tendency. In this paper, we cast context-aware sentiment analysis as a sequence classification task and propose a hierarchical sequence classification model based on Bidirectional Encoder Representations from Transformers (BERT). Our model extends the pre-trained BERT model, which is powerful at learning dependencies and extracting semantic information, with Bidirectional Long Short-Term Memory (BiLSTM) and Conditional Random Field (CRF) layers. Fine-tuning such a model on the sequence classification task enables it to jointly consider the contextual representation of each microblog and the label transitions between adjacent microblogs. Experimental evaluations on a public context-aware dataset show that the proposed model outperforms other reported methods by a large margin.
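The role of the CRF layer described above can be illustrated with a small Viterbi-decoding sketch. This is a hypothetical toy example, not the paper's implementation: the per-microblog emission scores (which would come from the BERT+BiLSTM encoder) and the label-transition scores (which a CRF would learn) are made-up values chosen to show how context can override an ambiguous local prediction.

```python
# Toy sketch of CRF-style Viterbi decoding over a sequence of microblogs.
# All scores below are assumed/illustrative, not learned parameters.

LABELS = ["negative", "neutral", "positive"]

# emissions[t][j]: encoder score for label j at microblog t (assumed values)
emissions = [
    [2.0, 0.5, 0.1],   # microblog 1 leans negative
    [0.4, 0.6, 0.5],   # microblog 2 is ambiguous on its own
    [1.8, 0.3, 0.2],   # microblog 3 leans negative
]

# transitions[i][j]: score for moving from label i to label j; these
# illustrative values favour keeping the same label across the sequence
transitions = [
    [1.0, 0.2, -0.5],
    [0.2, 1.0, 0.2],
    [-0.5, 0.2, 1.0],
]

def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence under the given scores."""
    n_labels = len(emissions[0])
    best = list(emissions[0])   # best path score ending in each label at step 0
    back = []                   # backpointers per step
    for t in range(1, len(emissions)):
        new_best, ptr = [], []
        for j in range(n_labels):
            cand = [best[i] + transitions[i][j] for i in range(n_labels)]
            i_max = max(range(n_labels), key=cand.__getitem__)
            new_best.append(cand[i_max] + emissions[t][j])
            ptr.append(i_max)
        best = new_best
        back.append(ptr)
    # backtrack from the best final label
    j = max(range(n_labels), key=best.__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    return [LABELS[k] for k in reversed(path)]

print(viterbi(emissions, transitions))  # prints ['negative', 'negative', 'negative']
```

On emissions alone, the second microblog would be labelled neutral; with the transition scores, decoding jointly over the sequence pulls it to negative, matching the intuition that surrounding microblogs disambiguate an otherwise non-committal post.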
