Dependencies between Student State and Speech Recognition Problems in Spoken Tutoring Dialogues

Speech recognition problems are a reality in current spoken dialogue systems. To better understand these phenomena, we study dependencies between speech recognition problems and several higher-level dialogue factors that define our notion of student state: frustration/anger, certainty, and correctness. We apply Chi-square (χ²) analysis to a corpus of speech-based computer tutoring dialogues to discover these dependencies both within and across turns. Significant dependencies are combined to yield insights into speech recognition problems and to propose new strategies for handling them. We also find that tutoring, as a new domain for speech applications, exhibits interesting tradeoffs and new factors to consider in spoken dialogue design.
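The analysis described above tests whether two discrete dialogue factors (e.g., whether a turn has a recognition problem and the student's certainty state) are statistically dependent. As a minimal sketch of that idea, the following stdlib-only Python computes a Chi-square test of independence on a 2×2 contingency table; the counts are invented for illustration and are not the paper's data:

```python
import math

def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 contingency table
    (no continuity correction); returns (statistic, p_value)."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under the independence hypothesis.
            expected = row_tot[i] * col_tot[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)), so no external stats library is needed.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical counts (invented for illustration):
# rows = turn has a recognition problem (yes / no),
# cols = student state (uncertain / certain).
table = [[40, 60],
         [20, 120]]
stat, p = chi_square_2x2(table)
print(f"chi2 = {stat:.2f}, p = {p:.4g}")
```

A small p-value (below the chosen significance level, e.g. 0.05) indicates that the two factors are dependent; the paper applies this kind of test both within a single turn and across adjacent turns.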
