Utilizing Crowdsourced Asynchronous Chat for Efficient Collection of Dialogue Dataset

In this paper, we design a crowd-powered system for efficiently collecting data to train dialogue systems. Conventional systems assign dialogue roles to a pair of crowd workers and record their interaction in an online chat. In this framework, the pair must work simultaneously, and each worker must wait while the other writes a message, which decreases work efficiency. Our proposed system allows multiple workers to create dialogues asynchronously, which frees workers from these time restrictions. We conducted an experiment with our system on a crowdsourcing platform to evaluate the efficiency and quality of dialogue collection. The results show that our system reduces the time needed to input a message by 68% while maintaining quality.
