Keyword-Aware Transformers Network for Chinese Open-Domain Conversation Generation

The open-domain conversation generation task aims to generate contextually relevant and informative responses based on a given conversation history. A critical challenge in open-domain dialogue is the tendency of models to produce safe, generic responses. Existing work often incorporates keyword information from the conversation history into response generation to alleviate this problem. However, these approaches either model only weak interactions between responses and keywords or ignore the association between keyword extraction and conversation generation. In this paper, we propose a Keyword-Aware Transformers Network (KAT) that fuses contextual keywords into response generation. Specifically, the model lets keywords and contexts fully interact with responses to enhance keyword semantics. We jointly model the keyword extraction task and the dialogue generation task in a multi-task learning fashion. Experimental results on two Chinese open-domain dialogue datasets show that our proposed model outperforms the compared methods on both semantic and non-semantic evaluation metrics, improving Coherence, Fluency, and Informativeness in human evaluation.
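
To make the two ideas in the abstract concrete, the sketch below shows (1) a decoder-side fusion layer in which response states attend jointly over the encoded context and the encoded keywords, and (2) a weighted multi-task objective combining response generation with keyword extraction. This is a minimal illustration under stated assumptions: the module names, the gating design, the per-token tagging formulation of keyword extraction, and the weight `alpha` are our own illustrative choices, not the paper's actual implementation.

```python
# Minimal PyTorch sketch of keyword-aware fusion + multi-task training.
# All names, dimensions, and the loss weight `alpha` are assumptions
# for illustration; the paper's exact architecture is not specified here.

import torch
import torch.nn as nn

class KeywordAwareFusion(nn.Module):
    """Fuses keyword semantics into decoder states via cross-attention."""

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.ctx_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.kw_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, resp, ctx, kw):
        # resp: (B, T_r, d) decoder states for the response
        # ctx:  (B, T_c, d) encoded conversation history
        # kw:   (B, T_k, d) encoded keyword sequence
        c, _ = self.ctx_attn(resp, ctx, ctx)  # response attends to context
        k, _ = self.kw_attn(resp, kw, kw)     # response attends to keywords
        fused = self.gate(torch.cat([c, k], dim=-1))  # merge both views
        return self.norm(resp + fused)        # residual + layer norm

def multitask_loss(gen_logits, gen_labels, kw_logits, kw_labels, alpha=0.5):
    """Joint objective: response generation + keyword extraction."""
    gen_loss = nn.functional.cross_entropy(
        gen_logits.view(-1, gen_logits.size(-1)),
        gen_labels.view(-1),
        ignore_index=-100)
    # Keyword extraction framed here as per-token binary tagging.
    kw_loss = nn.functional.binary_cross_entropy_with_logits(
        kw_logits, kw_labels.float())
    return gen_loss + alpha * kw_loss
```

In this reading, the gated residual lets the decoder weigh context evidence against keyword evidence token by token, while the shared encoder (implied by joint training) couples the two tasks so that better keyword extraction directly improves generation.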
