Get The Point of My Utterance! Learning Towards Effective Responses with Multi-Head Attention Mechanism

The attention mechanism has become a popular and widely used component in sequence-to-sequence models. However, neural generative dialogue systems in previous research tend to generate universal responses, and the attention distribution learned by the model tends to attend to the same semantic aspect of the user utterance. To address this problem, in this paper we propose a novel Multi-Head Attention Mechanism (MHAM) for generative dialogue systems, which aims to capture multiple semantic aspects of the user utterance. Furthermore, a regularizer is formulated to force different attention heads to concentrate on distinct aspects. The proposed mechanism leads to the generation of more informative, diverse, and relevant responses. Experimental results show that our proposed model outperforms several strong baselines.
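
To make the idea concrete, below is a minimal PyTorch sketch of multi-head attention over encoder states together with a disagreement-style penalty of the form ||A A^T - I||_F^2, in the spirit of the structured self-attention regularizer of Lin et al. (2017). The class and function names, the additive scoring form, and the penalty weight 0.1 are illustrative assumptions, not the paper's released implementation.

```python
# Minimal sketch (assumptions noted above): multi-head additive attention
# over encoder states, plus a penalty pushing heads toward distinct aspects.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    """One attention distribution per head over the source utterance."""

    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.head_scores = nn.Linear(hidden_size, num_heads, bias=False)

    def forward(self, encoder_states: torch.Tensor):
        # encoder_states: (batch, src_len, hidden)
        scores = self.head_scores(torch.tanh(self.proj(encoder_states)))
        attn = F.softmax(scores, dim=1)          # (batch, src_len, heads)
        # One context vector per head: (batch, heads, hidden)
        contexts = attn.transpose(1, 2) @ encoder_states
        return contexts, attn

def head_disagreement_penalty(attn: torch.Tensor) -> torch.Tensor:
    """||A A^T - I||_F^2 with A of shape (heads, src_len), per batch item."""
    a = attn.transpose(1, 2)                     # (batch, heads, src_len)
    gram = a @ a.transpose(1, 2)                 # (batch, heads, heads)
    eye = torch.eye(a.size(1), device=attn.device)
    return ((gram - eye) ** 2).sum(dim=(1, 2)).mean()

# Usage: add the penalty to the decoder's generation loss with a small weight.
mha = MultiHeadAttention(hidden_size=256, num_heads=4)
enc = torch.randn(8, 20, 256)                    # stand-in encoder outputs
ctx, attn = mha(enc)
nll = torch.zeros(())                            # stand-in for the NLL term
loss = nll + 0.1 * head_disagreement_penalty(attn)  # 0.1 is an assumed weight
```

Minimizing the penalty alongside the generation loss discourages the heads from collapsing onto the same source positions, which is the failure mode of standard single-head attention that the abstract describes.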
