Personalized Dialogue Generation with Persona-Adaptive Attention

Persona-based dialogue systems aim to generate consistent responses based on both the historical context and a predefined persona. Unlike conventional dialogue generation, persona-based dialogue must consider both the dialogue context and the persona, which poses a challenge for coherent training. Specifically, this requires a delicate balance of the weights given to context and persona. To achieve that, in this paper we propose an effective framework with Persona-Adaptive Attention (PAA), which adaptively integrates weights from the persona and context information via our designed attention. In addition, a dynamic masking mechanism is applied to the PAA, which not only drops redundant information in the context and persona but also serves as a regularizer that helps avoid overfitting. Experimental results demonstrate the superiority of the proposed PAA framework over strong baselines in both automatic and human evaluation. Moreover, the proposed PAA approach performs comparably in a low-resource regime: it achieves results similar to those of larger models trained in the full-data setting while using only 20% to 30% of the data. Finally, to fully examine the effectiveness of our design, we built several variants that handle the weighted information in different ways, demonstrating the necessity and sufficiency of our weighting and masking designs.
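To make the weighting and masking ideas concrete, the sketch below shows one plausible way such a mechanism could be wired in PyTorch: two cross-attention branches, one attending to the persona and one to the dialogue context, fused by a learned gate, with random masking of the fused features during training as regularization. The class name, the per-position sigmoid gate, and the dropout-style mask are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class PersonaAdaptiveAttention(nn.Module):
    """Hypothetical sketch of a persona-adaptive attention block.

    Two cross-attention branches (over persona and over context) are
    fused by a learned gate; a dynamic mask randomly drops part of the
    fused features during training as a regularizer.
    """

    def __init__(self, d_model: int, n_heads: int, mask_rate: float = 0.1):
        super().__init__()
        self.persona_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.context_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Assumed gate: maps each decoder state to a persona/context balance.
        self.gate = nn.Linear(d_model, 1)
        self.mask_rate = mask_rate

    def forward(self, dec_states, persona, context):
        # Cross-attend to the persona and the dialogue context separately.
        p_out, _ = self.persona_attn(dec_states, persona, persona)
        c_out, _ = self.context_attn(dec_states, context, context)
        # Adaptive weight in [0, 1] balancing the two information sources.
        w = torch.sigmoid(self.gate(dec_states))
        fused = w * p_out + (1.0 - w) * c_out
        # Dynamic masking (stand-in): randomly zero a fraction of the
        # fused features during training, similar to dropout.
        if self.training and self.mask_rate > 0:
            keep = (torch.rand_like(fused) > self.mask_rate).float()
            fused = fused * keep
        return fused
```

In this sketch the gate is computed per decoder position, so the model can lean on the persona for some tokens and on the context for others; the paper's actual weighting and masking schemes are more specific than this dropout-style stand-in.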
