The Adapter-Bot: All-In-One Controllable Conversational Model

Considerable progress has been made towards conversational models that generate coherent and fluent responses by training large language models on large dialogue datasets. However, these models offer little or no control over the generated responses and lack two important features: continual integration of new dialogue skills and seamless use of diverse knowledge sources. In this paper, we propose the Adapter-Bot, a dialogue model that uses a fixed backbone conversational model such as DialoGPT (Zhang et al., 2019) and triggers on-demand dialogue skills (e.g., empathetic responses, weather information, movie recommendation) via different adapters (Houlsby et al., 2019). Each adapter can be trained independently, allowing skills to be integrated continually without retraining the entire model. Depending on the skill, the model can process multiple knowledge types, such as text, tables, and graphs, in a seamless manner. Dialogue skills can be triggered automatically by a dialogue manager, or manually, allowing high-level control over the generated responses. At the current stage, we have implemented 12 response styles (e.g., positive, negative), 8 goal-oriented skills (e.g., weather information, movie recommendation), and personalized and empathetic responses. We evaluate our model with automatic metrics against existing state-of-the-art conversational models, and we have released an interactive system at adapter.bot.ust.hk.
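To make the adapter mechanism concrete, the following is a minimal sketch of skill routing over a frozen backbone via bottleneck adapters, in the style of Houlsby et al. (2019). The matrices, hidden sizes, skill labels, and helper names here are illustrative toys, not the released implementation or its trained weights.

```python
def linear(x, W, b):
    """y[i] = sum_j W[i][j] * x[j] + b[i] -- a plain affine map."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def relu(x):
    return [max(0.0, v) for v in x]

class Adapter:
    """Bottleneck adapter: down-project, ReLU, up-project, residual add.
    The backbone hidden state h is untouched except for this small delta,
    so each skill only carries its own few adapter parameters."""
    def __init__(self, W_down, b_down, W_up, b_up):
        self.W_down, self.b_down = W_down, b_down
        self.W_up, self.b_up = W_up, b_up

    def __call__(self, h):
        z = relu(linear(h, self.W_down, self.b_down))   # down-project
        delta = linear(z, self.W_up, self.b_up)          # up-project
        return [hi + di for hi, di in zip(h, delta)]     # residual add

# Toy adapter: hidden size 4, bottleneck size 2.
weather_adapter = Adapter(
    W_down=[[0.1, 0.0, 0.0, 0.0], [0.0, 0.1, 0.0, 0.0]], b_down=[0.0, 0.0],
    W_up=[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]], b_up=[0.0] * 4,
)

# A dialogue manager maps a predicted (or manually forced) skill label
# to its adapter; unknown skills fall back to the plain backbone output.
skills = {"weather": weather_adapter}

def apply_skill(h, skill):
    return skills[skill](h) if skill in skills else h

h = [1.0, 2.0, 3.0, 4.0]        # stand-in for a backbone hidden state
out = apply_skill(h, "weather")
```

Because each `Adapter` holds only its own small matrices, a new skill can be trained and registered in `skills` without touching the backbone or the other adapters, which is the property the paper relies on for continual skill integration.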

[1] Razvan Pascanu, et al. Overcoming catastrophic forgetting in neural networks, 2016, Proceedings of the National Academy of Sciences.

[2] Pascale Fung, et al. Generating Empathetic Responses by Looking Ahead the User’s Sentiment, 2020, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[3] Wanxiang Che, et al. Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog, 2020, ACL.

[4] Xiyuan Zhang, et al. Proactive Human-Machine Conversation with Explicit Conversation Goal, 2019, ACL.

[5] Xiaoyan Zhu, et al. Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory, 2017, AAAI.

[6] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.

[7] Jianfeng Gao, et al. A Persona-Based Neural Conversation Model, 2016, ACL.

[8] Richard Socher, et al. Global-to-local Memory Pointer Networks for Task-Oriented Dialogue, 2019, ICLR.

[9] Ming-Wei Chang, et al. A Knowledge-Grounded Neural Conversation Model, 2017, AAAI.

[10] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.

[11] Yang Feng, et al. Incremental Transformer with Deliberation Decoder for Document Grounded Conversations, 2019, ACL.

[12] Victor O. K. Li, et al. Video-based Emotion Recognition Using Deeply-Supervised Neural Networks, 2018, ICMI.

[13] Geoffrey E. Hinton, et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, 2017, ICLR.

[14] Lukasz Kaiser, et al. One Model To Learn Them All, 2017, arXiv.

[15] Y-Lan Boureau, et al. Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset, 2018, ACL.

[16] Mona Attariyan, et al. Parameter-Efficient Transfer Learning for NLP, 2019, ICML.

[17] Xifeng Yan, et al. Neural Assistant: Joint Action Prediction, Response Generation, and Latent Knowledge Reasoning, 2019, arXiv.

[18] Mitesh M. Khapra, et al. Towards Exploiting Background Knowledge for Building Conversation Systems, 2018, EMNLP.

[19] Christopher D. Manning, et al. A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue, 2017, EACL.

[20] Joelle Pineau, et al. A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues, 2016, AAAI.

[21] Pascale Fung, et al. CAiRE: An End-to-End Empathetic Chatbot, 2019, AAAI.

[22] Jianfeng Gao, et al. A Controllable Model of Grounded Response Generation, 2020, AAAI.

[23] Dilek Z. Hakkani-Tür, et al. Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations, 2019, INTERSPEECH.

[24] Graham Neubig, et al. Controlling Output Length in Neural Encoder-Decoders, 2016, EMNLP.

[25] Jianfeng Gao, et al. A Diversity-Promoting Objective Function for Neural Conversation Models, 2015, NAACL.

[26] Minlie Huang, et al. KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation, 2020, ACL.

[27] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.

[28] Victor O. K. Li, et al. Facial Action Unit Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution, 2020, AAAI.

[29] Christopher D. Manning, et al. Key-Value Retrieval Networks for Task-Oriented Dialogue, 2017, SIGDIAL.

[30] Peng Xu, et al. Variational Transformers for Diverse Response Generation, 2020, arXiv.

[31] Yi-Shin Chen, et al. CARER: Contextualized Affect Representations for Emotion Recognition, 2018, EMNLP.

[32] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.

[33] Quoc V. Le, et al. Towards a Human-like Open-Domain Chatbot, 2020, arXiv.

[34] R. French. Catastrophic forgetting in connectionist networks, 1999, Trends in Cognitive Sciences.

[35] Rongzhong Lian, et al. Learning to Select Knowledge for Response Generation in Dialog Systems, 2019, IJCAI.

[36] Geoffrey E. Hinton, et al. Adaptive Mixtures of Local Experts, 1991, Neural Computation.

[37] Iyad Rahwan, et al. Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm, 2017, EMNLP.

[38] Nayeon Lee, et al. Misinformation Has High Perplexity, 2020, arXiv.

[39] Robert A. Jacobs, et al. Hierarchical Mixtures of Experts and the EM Algorithm, 1993, Neural Computation.

[40] Natasha Jaques, et al. Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems, 2019, NeurIPS.

[41] Jason Yosinski, et al. Plug and Play Language Models: A Simple Approach to Controlled Text Generation, 2020, ICLR.

[42] Jason Weston, et al. Learning End-to-End Goal-Oriented Dialog, 2016, ICLR.

[43] Samira Shaikh, et al. Emotional Neural Language Generation Grounded in Situational Contexts, 2019, CCNLG.

[44] Nikhil Gupta, et al. Disentangling Language and Knowledge in Task-Oriented Dialogs, 2018, NAACL.

[45] Jason Weston, et al. Personalizing Dialogue Agents: I have a dog, do you have pets too?, 2018, ACL.

[46] Yan Xu, et al. CAiRE-COVID: A Question Answering and Multi-Document Summarization System for COVID-19 Research, 2020, arXiv.

[47] Geoffrey E. Hinton, et al. Layer Normalization, 2016, arXiv.

[48] Seungwhan Moon, et al. OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs, 2019, ACL.

[49] Jason Weston, et al. The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents, 2020, ACL.

[50] Pascale Fung, et al. Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems, 2018, ACL.

[51] Thomas Wolf, et al. TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents, 2019, arXiv.

[52] Meina Song, et al. KB-Transformer: Incorporating Knowledge into End-to-End Task-Oriented Dialog Systems, 2019, 15th International Conference on Semantics, Knowledge and Grids (SKG).

[53] Yangming Li, et al. Entity-Consistent End-to-end Task-Oriented Dialogue System with KB Retriever, 2019, EMNLP.

[54] Pascale Fung, et al. Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning, 2020, EMNLP.

[55] Mary Williamson, et al. Recipes for Building an Open-Domain Chatbot, 2020, EACL.

[56] Peng Xu, et al. MoEL: Mixture of Empathetic Listeners, 2019, EMNLP.

[58] Xiaoyu Shen, et al. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset, 2017, IJCNLP.

[59] Jianfeng Gao, et al. DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation, 2020, ACL.

[60] Jianfeng Gao, et al. Deep Reinforcement Learning for Dialogue Generation, 2016, EMNLP.

[61] Jianfeng Gao, et al. Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation, 2017, IJCNLP.

[62] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[63] Lihong Li, et al. Neural Approaches to Conversational AI, 2019, Foundations and Trends in Information Retrieval.

[64] Etsuko Ishii, et al. Plug-and-Play Conversational Models, 2020, Findings of EMNLP.

[65] Jamin Shin, et al. Attention over Parameters for Dialogue Systems, 2020, arXiv.

[66] Jason Weston, et al. Improving Conditioning in Context-Aware Sequence to Sequence Models, 2019, arXiv.

[67] Etsuko Ishii, et al. XPersona: Evaluating Multilingual Personalized Chatbot, 2020, NLP4CONVAI.

[68] Joelle Pineau, et al. The Second Conversational Intelligence Challenge (ConvAI2), 2019, The NeurIPS '18 Competition.

[69] Jason Weston, et al. Reading Wikipedia to Answer Open-Domain Questions, 2017, ACL.

[70] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.

[71] Mary Williamson, et al. Can You Put it All Together: Evaluating Conversational Agents’ Ability to Blend Skills, 2020, ACL.

[72] Jason Weston, et al. What makes a good conversation? How controllable attributes affect human judgments, 2019, NAACL.

[73] Xing Shi, et al. Hafez: an Interactive Poetry Generation System, 2017, ACL.

[74] Jason Weston, et al. Wizard of Wikipedia: Knowledge-Powered Conversational Agents, 2018, ICLR.