Advancing the State of the Art in Open Domain Dialog Systems through the Alexa Prize

Building open-domain conversational systems that let users hold engaging conversations on topics of their choice is a challenging task. The Alexa Prize was launched in 2016 to tackle the problem of achieving natural, sustained, coherent, and engaging open-domain dialogs. In the second iteration of the competition, in 2018, university teams advanced the state of the art by using context in dialog models, leveraging knowledge graphs for language understanding, handling complex utterances, building statistical and hierarchical dialog managers, and leveraging model-driven signals from user responses. The 2018 competition also provided competitors with a suite of tools and models, including the CoBot (conversational bot) toolkit, topic and dialog act detection models, conversation evaluators, and a sensitive content detection model, so that the competing teams could focus on building knowledge-rich, coherent, and engaging multi-turn dialog systems. This paper outlines the advances developed by the university teams and by the Alexa Prize team toward the common goal of advancing the science of Conversational AI. We address several key open-ended problems, such as conversational speech recognition, open-domain natural language understanding, commonsense reasoning, statistical dialog management, and dialog evaluation. These collaborative efforts have improved the experience of Alexa users: the average rating rose to 3.61, the median conversation duration to 2 minutes 18 seconds, and the average number of turns to 14.6, increases of 14%, 92%, and 54%, respectively, since the launch of the 2018 competition. For conversational speech recognition, we have reduced Word Error Rate by 55% and Entity Error Rate by 34%, both in relative terms, since the launch of the Alexa Prize. Socialbots improved in quality significantly more rapidly in 2018, in part due to the release of the CoBot toolkit.
