Characterizing task-oriented dialog using a simulated ASR channel

We describe a data collection consisting of task-oriented human-human conversations in a simulated automatic speech recognition (ASR) channel in which the word error rate (WER) is systematically varied. We find that users infrequently give a direct indication of having been misunderstood; that levels of expert “initiative” increase with WER, primarily due to increased grounding activity; and that asking task-related questions appears to be a more successful repair strategy at moderate WER levels. A PARADISE analysis finds task completion most predictive of user satisfaction; efficiency is also important at lower WERs.
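
For readers unfamiliar with the metric, the sketch below illustrates the conventional definition of WER: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognized hypothesis, divided by the number of reference words. It is a minimal illustration under that standard definition, not a description of how the simulated channel in this work was built or measured; the function name `word_error_rate` and the example utterances are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: whitespace tokenization and unit costs are
    assumptions, not details taken from the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deletion ("a") and one substitution ("boston" -> "austin")
    # against a five-word reference.
    print(word_error_rate("book a flight to boston", "book flight to austin"))
```

Because insertions are counted, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.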