A fusion model for multi-label emotion classification based on BERT and topic clustering

Emotion classification is one of the most critical tasks in natural language processing (NLP) and has applications in many fields. However, limited corpora, semantic ambiguity, and other constraints make the task difficult, and the accuracy of multi-label emotion classification remains unsatisfactory. To improve that accuracy, especially when semantic ambiguity occurs, we propose a fusion model for text based on self-attention and topic clustering. We use pre-trained BERT to extract hidden emotional representations of a sentence and an improved LDA topic model to cluster topics at different levels of the text. We then fuse the hidden representations of the sentence and use a classification neural network to compute the sentence's multi-label emotional intensity. Extensive experiments on the Chinese emotion corpus Ren_CECPs show that our model outperforms several strong baselines and related works. The F1-score of our model reaches 0.484, which is 0.064 higher than the best result reported in similar studies.
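
The fusion and classification steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that fusion is a plain concatenation of a sentence vector and a topic vector, and that the classifier is a single linear layer with one sigmoid output per emotion label. The abstract specifies neither choice. The vectors, weights, dimensions, and the names `fuse` and `multi_label_intensity` are hypothetical placeholders standing in for real BERT and LDA outputs.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Map a real-valued score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse(sentence_vec, topic_vec):
    """Fuse two feature vectors by concatenation (an assumed fusion operator)."""
    return list(sentence_vec) + list(topic_vec)


def multi_label_intensity(fused, weights, biases):
    """Score each emotion label independently with a linear layer and a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, fused)) + b)
        for row, b in zip(weights, biases)
    ]


random.seed(0)

# Placeholder for a sentence representation (in practice, produced by BERT).
sentence_vec = [random.uniform(-1.0, 1.0) for _ in range(8)]
# Placeholder for a topic distribution (in practice, produced by a topic model).
topic_vec = [0.7, 0.2, 0.1]

num_labels = 4  # hypothetical number of emotion labels
fused = fuse(sentence_vec, topic_vec)

# Randomly initialised classifier parameters; a real model would learn these.
weights = [[random.uniform(-0.5, 0.5) for _ in fused] for _ in range(num_labels)]
biases = [0.0] * num_labels

scores = multi_label_intensity(fused, weights, biases)
# Multi-label decision: every label whose intensity passes the threshold is kept.
predicted = [i for i, s in enumerate(scores) if s >= 0.5]
```

Because each label gets its own sigmoid rather than sharing a softmax, a sentence can receive several emotion labels at once, or none, which is what distinguishes the multi-label setting from single-label classification.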
