Open-ended, Extensible System Utterances Are Preferred, Even If They Require Filled Pauses

In many environments (e.g., sports commentary), situations unfold incrementally over time, and the occurrence of a relevant upcoming event can often be predicted, though not in all its details or its precise timing. We have built a simulation framework that uses our incremental speech synthesis component to assemble complex commentary utterances in a timely manner. In our evaluation, the resulting output is preferred over that of a baseline system that uses a simpler commenting strategy. Even in cases where the incremental system overcommits temporally and requires a filled pause to wait for the upcoming event, it is preferred over the baseline.
