Recipes for Building an Open-Domain Chatbot

Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients matter as well. Good conversation requires a number of blended skills: providing engaging talking points, and displaying knowledge, empathy, and personality appropriately, while maintaining a consistent persona. We show that large-scale models can learn these skills when given appropriate training data and choice of generation strategy. We build variants of these recipes with 90M-, 2.7B-, and 9.4B-parameter models, and make our models and code publicly available. Human evaluations show our best models outperform existing approaches in multi-turn dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing failure cases of our models.
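Since the models are publicly released, they can be exercised directly. The sketch below assumes the Hugging Face transformers port of the 2.7B-parameter model ("facebook/blenderbot-3B") rather than the authors' own ParlAI release, and illustrates decoding with beam search under a minimum response length, the kind of generation-strategy choice the abstract highlights; the specific decoding values here are illustrative assumptions, not the paper's tuned settings.

```python
# A minimal sketch of chatting with the released 2.7B model, assuming the
# Hugging Face transformers port ("facebook/blenderbot-3B"); the authors'
# own code and models are released through ParlAI.
from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration

model_name = "facebook/blenderbot-3B"  # 2.7B-parameter variant
tokenizer = BlenderbotTokenizer.from_pretrained(model_name)
model = BlenderbotForConditionalGeneration.from_pretrained(model_name)

utterance = "What's your favorite thing to talk about?"
inputs = tokenizer(utterance, return_tensors="pt")

# Beam search with a minimum output length discourages terse, generic
# replies; the exact values below are illustrative, not the paper's settings.
reply_ids = model.generate(
    **inputs,
    num_beams=10,
    min_length=20,
    max_length=128,
)
print(tokenizer.decode(reply_ids[0], skip_special_tokens=True))
```

In multi-turn use, previous turns would be concatenated into the input (subject to the model's context length) so the model can condition on the conversation history.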

[1]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[2]  Jianfeng Gao,et al.  A Persona-Based Neural Conversation Model , 2016, ACL.

[3]  Jason Weston,et al.  Dialog-based Language Learning , 2016, NIPS.

[4]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[5]  Timothy Baldwin,et al.  Sequence Effects in Crowdsourced Annotations , 2017, EMNLP.

[6]  Learning Robust Dialog Policies in Noisy Environments , 2017, ArXiv.

[7]  Jason Weston,et al.  ParlAI: A Dialog Research Software Platform , 2017, EMNLP.

[8]  Richard Socher,et al.  A Deep Reinforced Model for Abstractive Summarization , 2017, ICLR.

[9]  Ray Kurzweil,et al.  Learning Semantic Textual Similarity from Conversations , 2018, Rep4NLP@ACL.

[10]  Yann Dauphin,et al.  Hierarchical Neural Story Generation , 2018, ACL.

[11]  Hao Wu,et al.  Mixed Precision Training , 2017, ICLR.

[12]  Noam Shazeer,et al.  Adafactor: Adaptive Learning Rates with Sublinear Memory Cost , 2018, ICML.

[13]  Harry Shum,et al.  From Eliza to XiaoIce: challenges and opportunities with social chatbots , 2018, Frontiers of Information Technology & Electronic Engineering.

[14]  Jason Weston,et al.  Personalizing Dialogue Agents: I have a dog, do you have pets too? , 2018, ACL.

[15]  Yejin Choi,et al.  Learning to Write with Cooperative Discriminators , 2018, ACL.

[16]  Quoc V. Le,et al.  A Goal-oriented Neural Conversation Model by Self-Play , 2018 .

[17]  Gökhan Tür,et al.  Building a Conversational Agent Overnight with Dialogue Self-Play , 2018, ArXiv.

[18]  Alec Radford,et al.  Improving Language Understanding by Generative Pre-Training , 2018 .

[19]  Antoine Bordes,et al.  Training Millions of Personalized Dialogue Agents , 2018, EMNLP.

[20]  Jason Weston,et al.  Retrieve and Refine: Improved Sequence Generation Models For Dialogue , 2018, SCAI@EMNLP.

[21]  Bing Liu,et al.  Bootstrapping a Neural Conversational Agent with Dialogue Self-Play, Crowdsourcing and On-Line Reinforcement Learning , 2018, NAACL.

[22]  Omer Levy,et al.  RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.

[23]  MAKING KNIGHTS SMILE IN A FANTASY GAME WORLD , 2019 .

[24]  Jason Weston,et al.  ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons , 2019, ArXiv.

[25]  Natasha Jaques,et al.  Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems , 2019, NeurIPS.

[26]  Lav R. Varshney,et al.  CTRL: A Conditional Transformer Language Model for Controllable Generation , 2019, ArXiv.

[27]  M. Shoeybi,et al.  Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism , 2019, ArXiv.

[28]  Ilya Sutskever,et al.  Language Models are Unsupervised Multitask Learners , 2019 .

[29]  Joelle Pineau,et al.  The Second Conversational Intelligence Challenge (ConvAI2) , 2019, The NeurIPS '18 Competition.

[30]  Jason Weston,et al.  Wizard of Wikipedia: Knowledge-Powered Conversational agents , 2018, ICLR.

[31]  Y-Lan Boureau,et al.  Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset , 2018, ACL.

[32]  Jason Weston,et al.  Learning from Dialogue after Deployment: Feed Yourself, Chatbot! , 2019, ACL.

[33]  Jason Weston,et al.  Build it Break it Fix it for Dialogue Safety: Robustness from Adversarial Human Attack , 2019, EMNLP.

[34]  Quoc V. Le,et al.  GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism , 2018, ArXiv.

[35]  Yiming Yang,et al.  Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.

[36]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[37]  Thomas Wolf,et al.  TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents , 2019, ArXiv.

[38]  Jason Weston,et al.  Learning to Speak and Act in a Fantasy Text Adventure Game , 2019, EMNLP.

[39]  Myle Ott,et al.  fairseq: A Fast, Extensible Toolkit for Sequence Modeling , 2019, NAACL.

[40]  Bing Liu,et al.  Mix-review: Alleviate Forgetting in the Pretrain-Finetune Framework for Neural Language Generation Models , 2019, ArXiv.

[41]  Fabio Petroni,et al.  How Decoding Strategies Affect the Verifiability of Generated Text , 2019, FINDINGS.

[42]  Lukasz Kaiser,et al.  Reformer: The Efficient Transformer , 2020, ICLR.

[43]  J. Weston,et al.  The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents , 2019, ACL.

[44]  Jeremy Blackburn,et al.  The Pushshift Reddit Dataset , 2020, ICWSM.

[45]  J. Weston,et al.  Queens Are Powerful Too: Mitigating Gender Bias in Dialogue Generation , 2019, EMNLP.

[46]  Dan Klein,et al.  Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers , 2020, ArXiv.

[47]  Jason Weston,et al.  Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring , 2020, ICLR.

[48]  Omer Levy,et al.  BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension , 2019, ACL.

[49]  Jason Weston,et al.  Neural Text Generation with Unlikelihood Training , 2019, ICLR.

[50]  Harry Shum,et al.  The Design and Implementation of XiaoIce, an Empathetic Social Chatbot , 2018, CL.

[51]  Mary Williamson,et al.  Can You Put it All Together: Evaluating Conversational Agents’ Ability to Blend Skills , 2020, ACL.

[52]  Arman Cohan,et al.  Longformer: The Long-Document Transformer , 2020, ArXiv.

[53]  Jianfeng Gao,et al.  DIALOGPT : Large-Scale Generative Pre-training for Conversational Response Generation , 2019, Annual Meeting of the Association for Computational Linguistics.

[54]  Jason Weston,et al.  I love your chain mail! Making knights smile in a fantasy game world: Open-domain goal-oriented dialogue agents , 2019, ArXiv.

[55]  Jacob Andreas,et al.  Experience Grounds Language , 2020, EMNLP.

[56]  Yejin Choi,et al.  The Curious Case of Neural Text Degeneration , 2019, ICLR.

[57]  Quoc V. Le,et al.  Towards a Human-like Open-Domain Chatbot , 2020, ArXiv.

[58]  Alec Radford,et al.  Scaling Laws for Neural Language Models , 2020, ArXiv.

[59]  J. Weston,et al.  Don’t Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training , 2019, ACL.

[60]  Timothy P. Lillicrap,et al.  Compressive Transformers for Long-Range Sequence Modelling , 2019, ICLR.