MoEL: Mixture of Empathetic Listeners

Previous research on empathetic dialogue systems has mostly focused on generating responses given certain emotions. However, being empathetic not only requires the ability to generate emotional responses, but, more importantly, requires understanding user emotions and replying appropriately. In this paper, we propose a novel end-to-end approach for modeling empathy in dialogue systems: Mixture of Empathetic Listeners (MoEL). Our model first captures the user emotions and outputs an emotion distribution. Based on this distribution, MoEL softly combines the output states of the appropriate Listener(s), each of which is optimized to react to a certain emotion, and generates an empathetic response. Human evaluations on the empathetic-dialogues dataset (Rashkin et al., 2018) confirm that MoEL outperforms a multitask training baseline in terms of empathy, relevance, and fluency. Furthermore, a case study on the responses generated by different Listeners shows the high interpretability of our model.
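The soft combination described above is a mixture-of-experts gating step: the predicted emotion distribution weights the output states of the emotion-specific Listeners. The following is a minimal NumPy sketch of that gating mechanism only, not the full MoEL architecture; the function names (`moel_combine`, `gate_fn`) and the use of simple vector-valued listeners are illustrative assumptions, standing in for the Transformer encoder and decoders of the actual model.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a vector of logits.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def moel_combine(context_repr, listeners, gate_fn):
    """Softly combine listener output states, weighted by the emotion distribution.

    context_repr: encoded dialogue context, shape (d,)
    listeners:    list of K callables, each mapping (d,) -> (d,)
                  (stand-ins for the K emotion-specific Listener decoders)
    gate_fn:      maps (d,) -> K emotion logits (the emotion classifier head)
    """
    logits = gate_fn(context_repr)
    p = softmax(logits)                                       # emotion distribution
    states = np.stack([f(context_repr) for f in listeners])   # (K, d) listener states
    return p @ states, p                                      # weighted sum, gate weights
```

Because the gate is a soft distribution rather than a hard argmax, gradients flow to every Listener during training, while at inference time the dominant emotion's Listener contributes most to the response.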

[1] Fei-Fei Li, et al. Hierarchical Mixture of Classification Experts Uncovers Interactions between Brain Regions, 2009, NIPS.

[2] Volker Tresp, et al. Mixtures of Gaussian Processes, 2000, NIPS.

[3] Puneet Agrawal, et al. Understanding Emotions in Text Using Deep Learning and Big Data, 2019, Comput. Hum. Behav.

[4] Pascale Fung, et al. Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems, 2018, ACL.

[5] Xiaoyan Zhu, et al. Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory, 2017, AAAI.

[6] Babak Shahbaba, et al. Nonlinear Models Using Dirichlet Process Mixtures, 2007, J. Mach. Learn. Res.

[7] Pascale Fung, et al. Personalizing Dialogue Agents via Meta-Learning, 2019, ACL.

[8] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[9] Jason Weston, et al. Personalizing Dialogue Agents: I have a dog, do you have pets too?, 2018, ACL.

[10] Y-Lan Boureau, et al. I Know the Feeling: Learning to Converse with Empathy, 2018, ArXiv.

[11] Marc Peter Deisenroth, et al. Distributed Gaussian Processes, 2015, ICML.

[12] Richard Socher, et al. Global-to-local Memory Pointer Networks for Task-Oriented Dialogue, 2019, ICLR.

[13] Geoffrey E. Hinton, et al. Adaptive Mixtures of Local Experts, 1991, Neural Computation.

[14] Matthias Bethge, et al. Generative Image Modeling Using Spatial LSTMs, 2015, NIPS.

[15] Antoine Bordes, et al. Training Millions of Personalized Dialogue Agents, 2018, EMNLP.

[16] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[17] Pascale Fung, et al. Real-Time Speech Emotion and Sentiment Recognition for Interactive Dialogue Systems, 2016, EMNLP.

[18] Ke Wang, et al. SentiGAN: Generating Sentimental Texts via Mixture Adversarial Networks, 2018, IJCAI.

[19] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.

[20] Jason Weston, et al. Retrieve and Refine: Improved Sequence Generation Models For Dialogue, 2018, SCAI@EMNLP.

[21] Kyunghyun Cho, et al. Importance of Search and Evaluation Strategies in Neural Dialogue Modeling, 2018, INLG.

[22] Robert A. Jacobs, et al. Hierarchical Mixtures of Experts and the EM Algorithm, 1993, Neural Computation.

[23] Pascale Fung, et al. CAiRE_HKUST at SemEval-2019 Task 3: Hierarchical Attention for Dialogue Emotion Classification, 2019, SemEval@NAACL-HLT.

[24] Jason Weston, et al. Wizard of Wikipedia: Knowledge-Powered Conversational Agents, 2018, ICLR.

[25] Lukasz Kaiser, et al. One Model To Learn Them All, 2017, ArXiv.

[26] Pascale Fung, et al. HappyBot: Generating Empathetic Dialogue Responses by Improving User Experience Look-ahead, 2019, ArXiv.

[27] William Yang Wang, et al. MojiTalk: Generating Emotional Responses at Scale, 2017, ACL.

[28] Joelle Pineau, et al. Generative Deep Neural Networks for Dialogue: A Short Review, 2016, ArXiv.

[29] Nikhil Gupta, et al. Disentangling Language and Knowledge in Task-Oriented Dialogs, 2018, NAACL.

[30] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[31] Eric P. Xing, et al. Toward Controlled Generation of Text, 2017, ICML.

[32] Jason Weston, et al. Learning from Dialogue after Deployment: Feed Yourself, Chatbot!, 2019, ACL.

[33] Y-Lan Boureau, et al. Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset, 2018, ACL.

[34] Jianfeng Gao, et al. A Diversity-Promoting Objective Function for Neural Conversation Models, 2015, NAACL.

[35] Victor O. K. Li, et al. Multi-Region Ensemble Convolutional Neural Network for Facial Expression Recognition, 2018, ICANN.

[36] Boi Faltings, et al. Personalization in Goal-Oriented Dialog, 2017, ArXiv.

[37] Pascale Fung, et al. End-to-End Dynamic Query Memory Network for Entity-Value Independent Task-Oriented Dialog, 2018, ICASSP.

[38] Shuming Shi, et al. Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory, 2018, NAACL.

[39] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[40] Samy Bengio, et al. A Parallel Mixture of SVMs for Very Large Scale Problems, 2001, Neural Computation.

[41] Jianfeng Gao, et al. A Persona-Based Neural Conversation Model, 2016, ACL.

[42] Pascale Fung, et al. Team yeon-zi at SemEval-2019 Task 4: Hyperpartisan News Detection by De-noising Weakly-labeled Data, 2019, SemEval@NAACL-HLT.

[43] Thomas Wolf, et al. TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents, 2019, ArXiv.

[44] Zhoujun Li, et al. Response Generation by Context-aware Prototype Editing, 2018, AAAI.

[45] Joelle Pineau, et al. How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation, 2016, EMNLP.

[46] Jianfeng Gao, et al. Deep Reinforcement Learning for Dialogue Generation, 2016, EMNLP.

[47] Satoshi Nakamura, et al. Eliciting Positive Emotion through Affect-Sensitive Dialogue Response Generation: A Neural Network Approach, 2018, AAAI.

[48] Dilek Z. Hakkani-Tür, et al. DeepCopy: Grounded Response Generation with Hierarchical Pointer Networks, 2019, SIGdial.

[49] Peng Xu, et al. Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training, 2018, WASSA@EMNLP.

[50] Victor O. K. Li, et al. Unsupervised Domain Adaptation with Generative Adversarial Networks for Facial Emotion Recognition, 2018, IEEE Big Data.

[51] Joelle Pineau, et al. The Second Conversational Intelligence Challenge (ConvAI2), 2019, The NeurIPS '18 Competition.

[52] Pascale Fung, et al. End-to-End Recurrent Entity Network for Entity-Value Independent Goal-Oriented Dialog Learning, 2017.

[53] Kedhar Nath Narahari, et al. SemEval-2019 Task 3: EmoContext Contextual Emotion Detection in Text, 2019, SemEval@NAACL-HLT.

[54] Tinne Tuytelaars, et al. Expert Gate: Lifelong Learning with a Network of Experts, 2017, CVPR.

[55] Jason Weston, et al. Importance of a Search Strategy in Neural Dialogue Modelling, 2018, ArXiv.

[56] Victor O. K. Li, et al. Video-based Emotion Recognition Using Deeply-Supervised Neural Networks, 2018, ICMI.

[57] Danish Contractor, et al. 2019 Formatting Instructions for Authors Using LaTeX, 2018.

[58] Pascale Fung, et al. Nora the Empathetic Psychologist, 2017, INTERSPEECH.

[59] Lihong Li, et al. Neural Approaches to Conversational AI, 2019, Found. Trends Inf. Retr.

[60] Fei Sha, et al. Aiming to Know You Better Perhaps Makes Me a More Engaging Dialogue Partner, 2018, CoNLL.

[61] Carl E. Rasmussen, et al. Infinite Mixtures of Gaussian Process Experts, 2001, NIPS.

[62] Jason Weston, et al. Key-Value Memory Networks for Directly Reading Documents, 2016, EMNLP.

[63] Quoc V. Le, et al. A Neural Conversational Model, 2015, ArXiv.

[64] Geoffrey E. Hinton, et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, 2017, ICLR.