Exploiting Unsupervised Data for Emotion Recognition in Conversations

Emotion Recognition in Conversations (ERC) aims to predict the emotional state of speakers in conversations, which is essentially a text classification task. Unlike the sentence-level text classification problem, the available supervised data for the ERC task is limited, which potentially prevents the models from playing their maximum effect. In this paper, we propose a novel approach to leverage unsupervised conversation data, which is more accessible. Specifically, we propose the Conversation Completion (ConvCom) task, which attempts to select the correct answer from candidate answers to fill a masked utterance in a conversation. Then, we Pre-train a basic COntext- Dependent Encoder (Pre-CODE) on the ConvCom task. Finally, we fine-tune the Pre-CODE on the datasets of ERC. Experimental results demonstrate that pre-training on unsupervised data achieves significant improvement of performance on the ERC datasets, particularly on the minority emotion classes.

[1]  Michael R. Lyu,et al.  HiGRU: Hierarchical Gated Recurrent Units for Utterance-Level Emotion Recognition , 2019, NAACL.

[2]  Aapo Hyvärinen,et al.  Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.

[3]  Ying Chen,et al.  Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network , 2018, ACL.

[4]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[5]  Eduard Hovy,et al.  Emotion Recognition in Conversation: Research Challenges, Datasets, and Recent Advances , 2019, IEEE Access.

[6]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[7]  Louis-Philippe Morency,et al.  Combating Human Trafficking with Multimodal Deep Models , 2017, ACL.

[8]  Sopan Khosla,et al.  EmotionX-AR: CNN-DCNN autoencoder based Emotion Classifier , 2018, SocialNLP@ACL.

[9]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[10]  Erik Cambria,et al.  Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph , 2018, ACL.

[11]  Carlos Busso,et al.  IEMOCAP: interactive emotional dyadic motion capture database , 2008, Lang. Resour. Evaluation.

[12]  Rada Mihalcea,et al.  ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection , 2018, EMNLP.

[13]  Lun-Wei Ku,et al.  SocialNLP 2018 EmotionX Challenge Overview: Recognizing Emotions in Dialogues , 2018, SocialNLP@ACL.

[14]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[15]  Francis Y. L. Chin,et al.  EmotionX-DLC: Self-Attentive BiLSTM for Detecting Sequential Emotions in Dialogue , 2018, SocialNLP@ACL.

[16]  Xiaoyan Zhu,et al.  Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory , 2017, AAAI.

[17]  Jörg Tiedemann,et al.  OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles , 2016, LREC.

[18]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[19]  Jinho D. Choi,et al.  Emotion Detection on TV Show Transcripts with Sequence-based Convolutional Neural Networks , 2017, AAAI Workshops.

[20]  Sebastian Ruder,et al.  Universal Language Model Fine-tuning for Text Classification , 2018, ACL.

[21]  Ido Dagan,et al.  context2vec: Learning Generic Context Embedding with Bidirectional LSTM , 2016, CoNLL.

[22]  Erik Cambria,et al.  Context-Dependent Sentiment Analysis in User-Generated Videos , 2017, ACL.

[23]  Erik Cambria,et al.  Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos , 2018, NAACL.

[24]  Lun-Wei Ku,et al.  EmotionLines: An Emotion Corpus of Multi-Party Conversations , 2018, LREC.

[25]  Michael R. Lyu,et al.  Real-Time Emotion Recognition via Attention Gated Hierarchical Memory Network , 2019, AAAI.