Evorus: A Crowd-powered Conversational Assistant Built to Automate Itself Over Time

Crowd-powered conversational assistants have been shown to be more robust than automated systems, but at the cost of higher response latency and higher monetary cost. A promising direction is to combine the two approaches to obtain high-quality, low-latency, and low-cost solutions. In this paper, we introduce Evorus, a crowd-powered conversational assistant built to automate itself over time by (i) allowing new chatbots to be easily integrated to automate more scenarios, (ii) reusing prior crowd answers, and (iii) learning to automatically approve response candidates. Our 5-month-long deployment with 80 participants and 281 conversations shows that Evorus can automate itself without compromising conversation quality. Crowd-AI architectures have long been proposed as a way to reduce cost and latency for crowd-powered systems; Evorus demonstrates how automation can be introduced successfully in a deployed system. Its architecture allows future researchers to innovate further on the underlying automated components in the context of a deployed open-domain dialog system.
