Deploying Lifelong Open-Domain Dialogue Learning

Much of NLP research has focused on crowdsourced static datasets and the supervised learning paradigm of training once and then evaluating test performance. As argued in de Vries et al. (2020), crowdsourced data suffers from a lack of naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow a model to learn from its experiences of using language (Silver et al., 2013). In contrast, one might hope for machine learning systems that become more useful as they interact with people. In this work, we build and deploy a role-playing game in which human players converse with learning agents situated in an open-domain fantasy world. We show that by training models on the conversations they have with humans in the game, the models progressively improve, as measured by automatic metrics and online engagement scores. Learning from this data is shown to be more efficient than learning from crowdsourced data when applied to conversations with real users, and the data is far cheaper to collect.

[1] Jason Weston, et al. Neural Text Generation with Unlikelihood Training, 2019, ICLR.

[2] Mark B. Ring. Continual learning in reinforcement environments, 1995, GMD-Bericht.

[3] Jason Weston, et al. Personalizing Dialogue Agents: I have a dog, do you have pets too?, 2018, ACL.

[4] Jason Weston, et al. Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation, 2020, EMNLP.

[5] Luis von Ahn. Games with a Purpose, 2006, Computer.

[6] Jason Weston, et al. Learning from Dialogue after Deployment: Feed Yourself, Chatbot!, 2019, ACL.

[7] Cordelia Schmid, et al. End-to-End Incremental Learning, 2018, ECCV.

[8] Andreas Oikonomou, et al. A study of how different game play aspects can affect the popularity of role-playing video games, 2011, 2011 16th International Conference on Computer Games (CGAMES).

[9] Jason Weston, et al. Wizard of Wikipedia: Knowledge-Powered Conversational agents, 2018, ICLR.

[10] Jason Weston, et al. Learning Through Dialogue Interactions, 2016, ICLR.

[11] Jason Weston, et al. Dialogue Learning With Human-In-The-Loop, 2016, ICLR.

[12] Christopher D. Manning, et al. Towards Ecologically Valid Research on Language User Interfaces, 2020, ArXiv.

[13] Xiaoyu Shen, et al. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset, 2017, IJCNLP.

[14] P. Vorderer, et al. Video Games and the Pleasures of Control, 2000.

[15] Jason Weston, et al. What makes a good conversation? How controllable attributes affect human judgments, 2019, NAACL.

[16] Jianfeng Gao, et al. Challenges in Building Intelligent Open-domain Dialog Systems, 2019, ACM Trans. Inf. Syst.

[17] Yi Pan, et al. Conversational AI: The Science Behind the Alexa Prize, 2018, ArXiv.

[18] Jason Weston, et al. Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent, 2017, ICLR.

[19] Y-Lan Boureau, et al. Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset, 2018, ACL.

[20] P. Vorderer, et al. Media entertainment: The psychology of its appeal, 2000.

[21] Roberto Pieraccini, et al. A stochastic model of human-machine interaction for learning dialog strategies, 2000, IEEE Trans. Speech Audio Process.

[22] Yejin Choi, et al. The Curious Case of Neural Text Degeneration, 2019, ICLR.

[23] Bing Liu, et al. An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog, 2017, INTERSPEECH.

[24] Mary Williamson, et al. Recipes for Building an Open-Domain Chatbot, 2020, EACL.

[25] Oliver Lemon, et al. Reinforcement Learning for Adaptive Dialogue Systems - A Data-driven Methodology for Dialogue Management and Natural Language Generation, 2011, Theory and Applications of Natural Language Processing.

[26] Joelle Pineau, et al. A Deep Reinforcement Learning Chatbot, 2017, ArXiv.

[27] Jason Weston, et al. Build it Break it Fix it for Dialogue Safety: Robustness from Adversarial Human Attack, 2019, EMNLP.

[28] Xinlei Chen, et al. Never-Ending Learning, 2012, ECAI.

[29] Jianfeng Gao, et al. A Persona-Based Neural Conversation Model, 2016, ACL.

[30] Mohit Bansal, et al. Adversarial NLI: A New Benchmark for Natural Language Understanding, 2020, ACL.

[31] Bing Liu, et al. Lifelong and Interactive Learning of Factual Knowledge in Dialogues, 2019, SIGdial.

[32] Estevam R. Hruschka, et al. Toward an Architecture for Never-Ending Language Learning, 2010, AAAI.

[33] Dilek Z. Hakkani-Tür, et al. Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems, 2018, NAACL.

[34] Jason Weston, et al. Learning to Speak and Act in a Fantasy Text Adventure Game, 2019, EMNLP.

[35] Jason Weston, et al. Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring, 2020, ICLR.

[36] Jason Weston, et al. ParlAI: A Dialog Research Software Platform, 2017, EMNLP.

[37] Steve J. Young, et al. A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies, 2006, The Knowledge Engineering Review.

[38] Qiang Yang, et al. Lifelong Machine Learning Systems: Beyond Learning Algorithms, 2013, AAAI Spring Symposium: Lifelong Machine Learning.