Paper information - Looking at the Last Two Turns, I'd Say This Dialogue Is Doomed - Measuring Dialogue Success

Two sets of linguistic features are developed: the first estimates whether a single step in a dialogue between a human and a machine is successful, and the second classifies dialogues as a whole. The features are based on part-of-speech (POS) labels, word statistics, and properties of turns and dialogues. Experiments were carried out on the SympaFly corpus, which contains data from a real application in the flight booking domain. A single dialogue step could be classified with an accuracy of 83% (class-wise averaged recognition rate). The recognition rate for whole dialogues was 85%.
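
The class-wise averaged recognition rate quoted above is commonly understood as the unweighted mean of the per-class recognition rates (recalls), so that a rare class counts as much as a frequent one. The following is a minimal sketch under that assumption; the function name, the label names, and the example data are hypothetical and are not taken from the paper or from the SympaFly corpus.

```python
from collections import defaultdict


def classwise_averaged_recognition_rate(gold, predicted):
    """Unweighted mean of per-class recall (assumed reading of the metric)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("no labels given")

    total = defaultdict(int)    # number of items per gold class
    correct = defaultdict(int)  # correctly recognized items per gold class
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1

    # Each class contributes equally, regardless of how many items it has.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical labels for eight dialogue steps (illustration only).
gold = ["success"] * 6 + ["failure"] * 2
predicted = ["success"] * 6 + ["success", "failure"]

cl_rate = classwise_averaged_recognition_rate(gold, predicted)
overall = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
print(f"class-wise averaged: {cl_rate:.3f}, overall: {overall:.3f}")
```

In this toy example the overall accuracy (7 of 8 correct) is higher than the class-wise average, because the single error falls in the small "failure" class. That sensitivity to minority-class errors is the usual reason for reporting a class-wise average on imbalanced data.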