Bootstrapping Development of a Cloud-Based Spoken Dialog System in the Educational Domain From Scratch Using Crowdsourced Data

We propose a crowdsourcing-based framework to iteratively and rapidly bootstrap a dialog system from scratch for a new domain. We use the open-source, modular HALEF dialog system to deploy dialog applications. We illustrate the usefulness of this framework with four prototype dialog items that have applications in the educational domain, and we present initial results and insights from this work.
